Quality Digest      
January 21, 2020
Departments: Quality Software

Software Reviews

STATISTICA adds data mining and neural networks.

By Felix Grant

StatSoft’s release of STATISTICA 6 last year had been a long time coming, but it was well worth the wait. Version 6 has been thoroughly rebuilt from the ground up, preserving all of STATISTICA’s strengths and adding new ones. The interface is fully in line with current Windows conventions and has adopted a new Excel-like persona. The traditional programming-languages double act has been replaced by a single and powerful STATISTICA dialect of Visual Basic (SVB).

The menu system displayed grayed-out options for which the specialized extension products were not yet available, including quality control charts, data mining and neural networks. QC charts appeared some time back, and I’ve been putting them through their paces for months now. The other two arrived more recently, just as I was about to start a major investigation into fault origination for a client using another product.

QC charts access STATISTICA’s capabilities for dynamic transfer and update of data, giving a real-time view of data stream behavior, and can feed back the processed results to external events (e.g., alarms, settings or adjustments, process augmentation, scaling notifications, decision support). There must be limits to the density and variability of data flow, but I’ve not yet discovered them—despite some very demanding work that would make most software packages cry. The back end of this lot is almost completely user-definable but straightforward to use, and its public face is blessedly free of clutter or complication.
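For readers who want to see the feedback idea in miniature, here is a rough Python sketch of a control chart that watches a data stream and triggers an external action when a point falls outside its limits. It is not STATISTICA code and does not reflect how StatSoft implements its charts: the function names, the callback and the sample figures are all invented for illustration, and the limits use a plain sample standard deviation rather than the moving-range estimate a production individuals chart would normally use.

```python
from statistics import mean, stdev
from typing import Callable, Iterable, List, Tuple

Limits = Tuple[float, float, float]


def control_limits(baseline: List[float], sigmas: float = 3.0) -> Limits:
    """Return (lower limit, center line, upper limit) from in-control data.

    Simplification: spread is the sample standard deviation of the baseline.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def monitor(stream: Iterable[float], limits: Limits,
            on_alarm: Callable[[int, float], None]) -> List[int]:
    """Check each incoming value; call on_alarm for points outside the limits.

    Returns the positions of the flagged points.
    """
    lower, _, upper = limits
    flagged = []
    for index, value in enumerate(stream):
        if value < lower or value > upper:
            on_alarm(index, value)  # hypothetical hook: alarm, adjustment, etc.
            flagged.append(index)
    return flagged


# Illustrative numbers only, not data from the review.
limits = control_limits([10.0, 10.2, 9.8, 10.1, 9.9])
out_of_control = monitor(
    [10.0, 10.3, 11.0, 9.4, 10.1],
    limits,
    lambda i, v: print(f"alarm: reading {i} = {v}"),
)
```

The callback is where the "external event" would go; in this toy version it only prints a message, but it could just as well write to a log or signal another system.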

STATISTICA’s Data Miner is the best tool I’ve seen yet for actually applying what you learn at the sharp end of industry. Once Data Miner is installed, STATISTICA loads it in the foreground by default when started (but, like almost everything else, this can be altered; STATISTICA remains one of the most customizable environments around). If you’ve used drag-and-connect nodal interfaces before, you’ll be right at home. If not, you’ll soon get the idea: this is also the easiest-to-use data mining control I’ve encountered, although the basic approach is widespread. Start with your data, end with a result; decide what is to be done in between, and in what sequence, laying out the steps on a background; then drag arrows from one thing to the next to “join up the dots.”

The dots can be almost any tool you care to employ. They can even, in principle, lie outside STATISTICA’s own tool set, as a client/server setup allows programmed access to both data and capabilities in other programs. In practice, though, they’re most likely to represent one or more of the exploratory analytic routines in which STATISTICA is so strong.
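The join-up-the-dots approach can be pictured as a small directed graph: each dot is a processing step, each arrow says whose output feeds whom, and a step runs once the step upstream of it has finished. The Python sketch below shows that idea only. It is not Data Miner’s interface or SVB; the names run_workflow, steps and arrows are made up for the example, and it assumes each step has at most one incoming arrow.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List, Tuple


def run_workflow(raw_data: Any,
                 steps: Dict[str, Callable[[Any], Any]],
                 arrows: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Run each step after the step that feeds it.

    steps maps a step name to a one-argument function.
    arrows lists (from_step, to_step) pairs; a step with no incoming
    arrow receives raw_data.
    """
    upstream: Dict[str, str] = {}
    for start, end in arrows:
        if end in upstream:
            raise ValueError(f"step {end!r} has more than one incoming arrow")
        upstream[end] = start

    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: ({upstream[name]} if name in upstream else set())
             for name in steps}

    results: Dict[str, Any] = {}
    for name in TopologicalSorter(graph).static_order():
        source = results[upstream[name]] if name in upstream else raw_data
        results[name] = steps[name](source)
    return results


# Illustrative workflow: clean the data, then summarize it two ways.
results = run_workflow(
    [4.0, None, 6.0, 8.0],
    steps={
        "drop_missing": lambda rows: [r for r in rows if r is not None],
        "average": lambda rows: sum(rows) / len(rows),
        "count": len,
    },
    arrows=[("drop_missing", "average"), ("drop_missing", "count")],
)
```

Ordering the steps with a topological sort means the layout can be described in any order, much as dots can be placed on the background before the arrows are drawn.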

In the particular case of my client, the concern was with apparently random recurring fault clusters. These clusters survived unscathed through all the company’s best efforts to control and certify every process stage. They also defied several extensive—and expensive—internal attempts to identify common factors. So, one of the most important dots to be joined up in my Data Miner window was the brand-new STATISTICA Neural Networks module.

I’m well-known for feeling uncomfortable around neural nets. I’m an old-fashioned statistician, used to knowing what’s going on. I’ll never become entirely easy with these wee-little beasties that disappear off inside their black box and emerge with an answer but can’t tell you how they got it. However, that’s my problem, not theirs; experience has taught me that, properly employed, the beasties do come up with the goods and get it right. This task, where a linkage is suspected between undesirable effect and unknown causes, buried in terabytes of data beyond human envisioning, was tailor-made for them.

As with data mining, there are a lot of excellent tools out there. They vary tremendously in every way, and each is best for some set of requirements. Overall though, SNN has for some time tended to lead the field on points, with particular strengths for unfamiliar users and those with reporting or team working priorities. This generalization becomes even stronger with the move up from v. 4 to release 6 (there was no 5; SNN climbed through a rapid series of upgrades to release 4 during the life of STATISTICA 5.X and has now aligned with the rest of StatSoft’s stable). Symbiosis with the base product is now much tighter than it used to be, providing an added advantage: Only one other product I know of can provide such complete integration with a larger and more general analytic environment, and arguably nothing can combine both with this degree of process control and response.

These new additions to the STATISTICA 6 product range take traditional quality further, make it more accessible, and introduce a new level of integration that yields considerable synergy payoffs. If you have unsolved analytical problems, try STATISTICA 6.

About the author

Felix Grant is a lecturer and consultant in the United Kingdom. Letters to the editor regarding this article can be sent to letters@qualitydigest.com.


STATISTICA 6
by StatSoft Inc.

Requirements: Runs on Windows 95, 98, 2000, NT, ME or XP. 32 MB RAM. Mac OS—base version only, or any Windows version using Virtual PC (included).

Price: $795+. For data mining, neural networks and QC tool add-ons, contact StatSoft.

Contact: StatSoft Inc.
2300 East 14th St.
Tulsa, OK 74104
Phone: (918) 749-1119
Fax: (918) 749-2217
E-mail: info@statsoft.com