Load data in NumPy

We are given some data, and we want to apply some machine learning to it (classification, clustering, etc.). To do that, we need to load the data into NumPy arrays.

I browsed through NumPy’s examples and APIs and, while there are batch loading methods (load a whole file into an array), there’s no method for appending a row in place the way Python’s built-in list does. Why?
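As a rough sketch of the batch-loading route, np.loadtxt reads a whole delimited text source into one array. The sample rows and their column layout below are made up for illustration, and an in-memory buffer stands in for a real file.

```python
import io

import numpy as np

# Hypothetical sample: two feature columns and a class label per row.
raw = io.StringIO("1.0,2.0,0\n3.0,4.0,1\n5.0,6.0,0\n")

# loadtxt parses the whole source in one call and returns a 2-D float array.
data = np.loadtxt(raw, delimiter=",")

features = data[:, :2]           # every column except the last
labels = data[:, 2].astype(int)  # last column as integer class labels

print(data.shape)
print(labels)
```

With a real file, the buffer would be replaced by the file's path.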

Because NumPy is after efficiency and performance. An array lives in one fixed-size, contiguous block of memory, and dynamically allocating space and moving data around is very time-consuming. So NumPy allocates all the space beforehand.
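Here is a minimal sketch of two ways to work with that design, assuming the row values are generated on the fly (they are placeholders, not real data). Either allocate the full array once and fill it row by row, or collect rows in a plain Python list and convert once at the end. np.append does exist, but it returns a new array and copies the old one on every call, which is exactly the cost described above.

```python
import numpy as np

n_rows, n_cols = 4, 3  # assumed to be known before loading

# Option 1: allocate once, then fill rows in place (no reallocation).
table = np.empty((n_rows, n_cols))
for i in range(n_rows):
    table[i] = [i, i * 2, i * 3]

# Option 2: row count unknown, so collect in a list and convert once.
rows = []
for i in range(n_rows):
    rows.append([i, i * 2, i * 3])
from_list = np.array(rows, dtype=float)

# np.append copies: `table` is left unchanged and a new array comes back.
grown = np.append(table, [[9.0, 9.0, 9.0]], axis=0)

print(table.shape, from_list.shape, grown.shape)
```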

 

Install SciPy/NumPy on Linux

SciPy: You can try ‘yum install’. But I don’t know where the installed library went, and I couldn’t import it in my Python script, so I had to install from source.
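One way to check where a package ended up, or whether the interpreter can see it at all, is to ask Python's import machinery directly. This is only a sketch: locate_module is a helper name made up here, and it relies on the standard library's importlib.util.find_spec.

```python
import importlib.util


def locate_module(name):
    """Return the path Python would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


for name in ("numpy", "scipy"):
    path = locate_module(name)
    print(name, "->", path if path else "not found on sys.path")
```

If a package shows up as not found, it was probably installed for a different Python interpreter than the one running the script.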

You need a Fortran compiler. I used yum to install gfortran44.

Then install BLAS. So far this one has been helpful: https://stackoverflow.com/questions/7496547/python-scipy-needs-blas

On a 64-bit environment, don’t forget to add -fPIC -m64 to the Fortran compiler options in make.inc.

To build SciPy, run ‘python setup.py build’. If it complains about not finding a Fortran compiler, create a symbolic link to your favorite Fortran compiler, e.g. ‘sudo ln -s /usr/bin/gfortran44 /usr/bin/gfortran’. After that I was able to finish compiling.

This link may also be helpful: http://bickson.blogspot.com/2011/02/installing-blaslapackitpp-on-amaon-ec2.html