Tax collects Linux for open source analytics
- 19 September, 2005 07:15
Despite its strong ties to proprietary software vendors, the Australian Taxation Office has finally dipped its toes into open source by establishing a Linux-based system for information analytics.
Tax has had a Teradata warehouse for 10 years but the change program initiated by chief knowledge officer Philip Hind in November last year is a "series of activities to enhance the ATO's ability to collect and analyse data", assistant commissioner of information management John Body told Computerworld.
As a result, the ATO invested in Teradata's Warehouse Miner and SAS Enterprise Miner, and hired CSIRO's principal computer scientist for enterprise data mining, Dr Graham Williams, who pioneered open source analytics for organizations like the Health Insurance Commission and NRMA.
Tax implemented the system in March and now has a number of open source products for analytics running on a Debian GNU/Linux stand-alone server. A spokesperson for the ATO said part of the advantage of the GNU/Linux environment is that it ships with a large collection of basic tools that interoperate, like Emacs, Vim, Perl and cvs.
"These provide many basic data manipulation capabilities over the very large datasets that we are dealing with for our analytics work," the spokesperson said.
In addition to the basic tools, Tax has also deployed open source data mining and data manipulation packages, including "R" - a statistical and data mining package widely used in industry and academia that supports modern data mining approaches and graphing capabilities. R is used for data summarization, manipulation and cleaning, modeling, and model evaluation.
Tax is also using Weka, a Java-based data mining toolkit with more than 60 traditional and modern analytic tools.
For development, Tax is using the Python scripting language, which is "ideally suited to automated data manipulation and transformation".
Analytics looks at the relationships within the data to make it easier for clients to pay tax, and for the ATO to identify cheats. Tax has been looking through data for some considerable time, but data mining with more sophisticated algorithms is a recent initiative.
The ATO's assistant commissioner of analytics, Stuart Hamilton, said the Office is using open source because a lot of the newer algorithms and techniques are available before they make it into enterprise software.
"The data is explored with the open source tools and then SAS is used to do predictive modelling," Hamilton said, adding that the system seems to be "working satisfactorily".
"Open source provides some advantages in terms of flexibility and costs but we can't say it is industrial-strength enough to handle millions of records with hundreds of conditions."
Even with the early successes, open source is still strictly limited to a stand-alone system and is not allowed on the main network "because of some issues involved in using open source".
"For example, if something goes wrong, where is your fallback?" Hamilton said.
Hamilton said Tax will investigate open source further as the new analytics system forms part of a trial of certain software to evaluate risk and fit for purpose.
Open source is being used at the "innovative edge" at the ATO and Hamilton thinks it unlikely that within a year open source will be used on the main computing platforms.
"Trials will continue and we may find niches," he said.