{"id":7871,"date":"2015-11-03T18:50:04","date_gmt":"2015-11-03T18:50:04","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2015\/11\/03\/pydata-pandas\/"},"modified":"2015-11-03T18:50:04","modified_gmt":"2015-11-03T18:50:04","slug":"pydata-pandas","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2015\/11\/03\/pydata-pandas\/","title":{"rendered":"pydata\/pandas"},"content":{"rendered":"<p><img decoding=\"async\" src=\"http:\/\/travis-ci.org\/pydata\/pandas.svg?branch=master\" \/><\/p>\n<h2>What is it<\/h2>\n<p><strong>pandas<\/strong> is a Python package providing fast, flexible, and expressive data structures designed to make working with \u201crelational\u201d or \u201clabeled\u201d data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, <strong>real world<\/strong> data analysis in Python. Additionally, it has the broader goal of becoming <strong>the most powerful and flexible open source data analysis \/ manipulation tool available in any language<\/strong>. It is already well on its way toward this goal.<\/p>\n<h2>Main Features<\/h2>\n<p>Here are just a few of the things that pandas does well:<\/p>\n<ul>\n<li>Easy handling of <strong>missing data<\/strong> (represented as <code>NaN<\/code>) in floating point as well as non-floating point data<\/li>\n<li>Size mutability: columns can be <strong>inserted and deleted<\/strong> from DataFrame and higher dimensional objects<\/li>\n<li>Automatic and explicit <strong>data alignment<\/strong>: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let <code>Series<\/code>, <code>DataFrame<\/code>, etc. 
automatically align the data for you in computations<\/li>\n<li>Powerful, flexible <strong>group by<\/strong> functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data<\/li>\n<li><strong>Easy conversion<\/strong> of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects<\/li>\n<li>Intelligent label-based <strong>slicing<\/strong>, <strong>fancy indexing<\/strong>, and <strong>subsetting<\/strong> of large data sets<\/li>\n<li>Intuitive <strong>merging<\/strong> and <strong>joining<\/strong> of data sets<\/li>\n<li>Flexible <strong>reshaping<\/strong> and <strong>pivoting<\/strong> of data sets<\/li>\n<li><strong>Hierarchical<\/strong> labeling of axes (possible to have multiple labels per tick)<\/li>\n<li>Robust IO tools for loading data from <strong>flat files<\/strong> (CSV and delimited), <strong>Excel files<\/strong>, and <strong>databases<\/strong>, and for saving\/loading data in the ultrafast <strong>HDF5 format<\/strong><\/li>\n<li><strong>Time series<\/strong>-specific functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging, etc.<\/li>\n<\/ul>\n<h2>Where to get it<\/h2>\n<p>The source code is currently hosted on GitHub at: http:\/\/github.com\/pydata\/pandas<\/p>\n<p>Binary installers for the latest released version are available at the Python Package Index (PyPI):<\/p>\n<pre><code>http:\/\/pypi.python.org\/pypi\/pandas\/\n<\/code><\/pre>\n<p>And via <code>easy_install<\/code>:<\/p>\n<pre><code>easy_install pandas\n<\/code><\/pre>\n<p>or <code>pip<\/code>:<\/p>\n<pre><code>pip install pandas\n<\/code><\/pre>\n<p>or <code>conda<\/code>:<\/p>\n<pre><code>conda install pandas\n<\/code><\/pre>\n<h2>Dependencies<\/h2>\n<h3>Highly Recommended Dependencies<\/h3>\n<ul>\n<li>numexpr\n<ul>\n<li>Needed to accelerate some expression evaluation operations<\/li>\n<li>Required by 
PyTables<\/li>\n<\/ul>\n<\/li>\n<li>bottleneck\n<ul>\n<li>Needed to accelerate certain numerical operations<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>Optional dependencies<\/h3>\n<h4>Notes about HTML parsing libraries<\/h4>\n<ul>\n<li>\n<p>If you install BeautifulSoup4 you must install either lxml or html5lib or both. <code>pandas.read_html<\/code> will <strong>not<\/strong> work with <em>only<\/em> <code>BeautifulSoup4<\/code> installed.<\/p>\n<\/li>\n<li>\n<p>You are strongly encouraged to read HTML reading gotchas. It explains issues surrounding the installation and usage of the above three libraries.<\/p>\n<\/li>\n<li>\n<p>You may need to install an older version of BeautifulSoup4:<\/p>\n<ul>\n<li>Versions 4.2.1, 4.1.3 and 4.0.2 have been confirmed for 64-bit and 32-bit Ubuntu\/Debian<\/li>\n<\/ul>\n<\/li>\n<li>\n<p>Additionally, if you\u2019re using Anaconda you should definitely read the gotchas about HTML parsing libraries<\/p>\n<\/li>\n<li>\n<p>If you\u2019re on a system with <code>apt-get<\/code> you can do<\/p>\n<pre><code>sudo apt-get build-dep python-lxml\n<\/code><\/pre>\n<p>to get the necessary dependencies for installation of lxml. This will prevent further headaches down the line.<\/p>\n<\/li>\n<\/ul>\n<h2>Installation from sources<\/h2>\n<p>To install pandas from source you need Cython in addition to the normal dependencies above. 
Cython can be installed from PyPI:<\/p>\n<pre><code>pip install cython\n<\/code><\/pre>\n<p>In the <code>pandas<\/code> directory (same one where you found this file after cloning the git repo), execute:<\/p>\n<pre><code>python setup.py install\n<\/code><\/pre>\n<p>or for installing in development mode:<\/p>\n<pre><code>python setup.py develop\n<\/code><\/pre>\n<p>Alternatively, you can use <code>pip<\/code> if you want all the dependencies pulled in automatically (the <code>-e<\/code> option is for installing it in development mode):<\/p>\n<pre><code>pip install -e .\n<\/code><\/pre>\n<p>On Windows, you will need to install MinGW and execute:<\/p>\n<pre><code>python setup.py build --compiler=mingw32\npython setup.py install\n<\/code><\/pre>\n<p>See http:\/\/pandas.pydata.org\/ for more information.<\/p>\n<h2>License<\/h2>\n<p>BSD<\/p>\n<h2>Documentation<\/h2>\n<p>The official documentation is hosted on PyData.org: http:\/\/pandas.pydata.org\/<\/p>\n<p>The Sphinx documentation should provide a good starting point for learning how to use the library. Expect the docs to continue to expand as time goes on.<\/p>\n<h2>Background<\/h2>\n<p>Work on <code>pandas<\/code> started at AQR (a quantitative hedge fund) in 2008 and has been under active development since then.<\/p>\n<h2>Discussion and Development<\/h2>\n<p>Since pandas development is related to a number of other scientific Python projects, questions are welcome on the scipy-user mailing list. Specialized discussions or design issues should take place on the PyData mailing list \/ Google group:<\/p>\n<p>https:\/\/groups.google.com\/forum\/#!forum\/pydata<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What is it pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with \u201crelational\u201d or \u201clabeled\u201d data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. 
Additionally, it has the broader goal of becoming the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-7871","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/7871","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=7871"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/7871\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=7871"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=7871"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=7871"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}