Essential Guide

The state of the enterprise cloud and prepping for AWS re:Invent 2013


What are useful resources for a newcomer to data analysis techniques?

So you've collected data -- now what? With the help of these resources, data analysis techniques can bring business intelligence to an enterprise.

Many organizations are adept at collecting data, but the real value is only realized when the data is analyzed. Creating and maintaining a data analysis practice will require support from cloud administrators, as well as data analysts. Cloud administrators will be called on to configure systems, evaluate architectures and maintain infrastructure for data analysts. The more you know about the practice of data analysis, the better you can support it.

Combining books and online tutorials with hands-on work in the tools themselves lets you dive into data analysis while staying grounded in your own real-world problems.

Many data analysis techniques are taken from statistics and machine learning. Coursera.org, the free resource for massive open online courses, offers courses in computing for data analysis, mathematical modeling and statistics. Andrew Ng's course on machine learning at Coursera is well designed for students new to the topic.
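To give a sense of what those statistical techniques look like in practice, here is a minimal sketch using only Python's standard library. The request counts and CPU load figures are hypothetical values invented for illustration, and `least_squares` is a helper written for this example rather than a function from any course or book mentioned here.

```python
import statistics

# Hypothetical sample: daily request counts and average CPU load (%) for one server.
requests = [120, 150, 170, 200, 230, 260]
cpu_load = [22.0, 27.0, 30.0, 36.0, 41.0, 45.0]


def least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Descriptive statistics: a first look at the center and spread of the data.
summary = {
    "mean": statistics.mean(cpu_load),
    "median": statistics.median(cpu_load),
    "stdev": statistics.stdev(cpu_load),
}

# A simple model: how does CPU load change as request volume grows?
slope, intercept = least_squares(requests, cpu_load)
predicted = slope * 300 + intercept

print(summary)
print(f"predicted load at 300 requests: {predicted:.1f}%")
```

Summary statistics describe the data you already have, while the fitted line is a very small example of the kind of model that statistics and machine learning courses build on.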

Philipp Janert's book Data Analysis with Open Source Tools introduces statistical techniques along with open source tools. Wes McKinney's Python for Data Analysis: Agile Tools for Real-World Data is a good introduction to working with data in Python.

R is a widely used open source statistical analysis tool with a large collection of add-on packages. The R Tutorial is a gentle introduction to R that also includes some more advanced articles. The pandas package for Python has features comparable to R, and it is a good fit for Python developers who want to use one language for collecting, formatting and analyzing data.
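As a rough illustration of the collect, format and analyze workflow in pandas, the sketch below loads a few records, converts a text column to numbers and summarizes it by group. The records and column names (`region`, `month`, `sales`) are hypothetical and stand in for whatever your own systems export.

```python
import pandas as pd

# Collecting: hypothetical records, standing in for data pulled from logs or an export.
records = [
    {"region": "east", "month": "2013-06", "sales": "1200"},
    {"region": "east", "month": "2013-07", "sales": "1350"},
    {"region": "west", "month": "2013-06", "sales": "980"},
    {"region": "west", "month": "2013-07", "sales": None},
]
df = pd.DataFrame(records)

# Formatting: convert text to numbers; missing entries become NaN.
df["sales"] = pd.to_numeric(df["sales"], errors="coerce")

# Analyzing: per-region totals and averages (NaN values are skipped).
by_region = df.groupby("region")["sales"].agg(["sum", "mean"])
print(by_region)
```

The same pattern scales from a handful of rows to files read with `pd.read_csv`, which is one reason pandas suits developers who already collect their data with Python.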

Getting started with data-mining tools does not have to be intimidating. RapidMiner is an open source data-mining tool with an easy-to-use interface and a wide collection of research tools available.

Visualization tools such as Tableau can help you better understand large data sets with many variables. Tableau is a paid product, but a free trial is available if you want to give it a try.

About the author:
Dan Sullivan, M.Sc., is an author, systems architect and consultant with more than 20 years of IT experience. He has had engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education. Dan has written extensively about topics that range from data warehousing, cloud computing and advanced analytics to security management, collaboration and text mining.

This was first published in August 2013
