Data Visualization with Python and JavaScript
http://shop.oreilly.com/product/0636920037057.do?cmp=tw-data-books-videos-product-na_book_video_tweet

With Early Release ebooks, you get books in their earliest form (the author's raw and unedited content as he or she writes) so you can take advantage of these technologies long before the official release.

Crafting a Data-visualisation Toolchain for the Web
Kyran Dale
Beijing · Boston · Farnham · Sebastopol · Tokyo
O'REILLY

Data Visualization with Python and JavaScript
by Kyran Dale
Copyright © 2016 Kyran Dale. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Dawn Schanafelt and Meghan Blanchette
Production Editor: FILL IN PRODUCTION EDITOR
Copyeditor: FILL IN COPYEDITOR
Proofreader: FILL IN PROOFREADER
Indexer: FILL IN INDEXER
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

January -4712: First Edition

Revision History for the First Edition
2016-02-22: First Early Release
2016-03-21: Second Early Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491956434 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Data Visualization with Python and JavaScript, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

While the publisher and the author(s) have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author(s) disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work.
Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-95643-4
[FILL IN]

Table of Contents

Introduction
1. A Development Setup
   Python
   JavaScript
   Databases
   Integrated Development Environments
   Summary

Part I. A Basic Toolkit
2. A Language Learning Bridge Between Python and JavaScript
   Similarities and Differences
   Interacting with the Code
   Basic Bridge Work
   Differences in Practice
   A Cheatsheet
   Summary
3. Reading and Writing Data with Python
   Easy Does It
   Passing Data Around
   Working with System Files
   CSV, TSV and Row-column Data-formats
   JSON
   SQL
   MongoDB
   Dealing with Dates, Times and Complex Data
   Summary
4. Webdev 101
   The Big Picture
   Single-page Apps
   Tooling Up
   Building a Web-page
   Chrome's Developer Tools
   A Basic Page with Placeholders
   Scalable Vector Graphics (SVG)
   Summary

Part II. Getting Your Data
5. Getting Data off the Web with Python
   Getting Web-data with the requests Library
   Getting Data-files with requests
   Using Python to Consume Data from a Web-API
   Using Libraries to Access Web-APIs
   Scraping Data
   Summary
6. Heavyweight Scraping with Scrapy
   Setting up Scrapy
   Establishing the Targets
   Targeting HTML with Xpaths
   A First Scrapy Spider
   Scraping the Individual Biography Pages
   Chaining Requests and Yielding Data
   Scrapy Pipelines
   Scraping Text and Images with a Pipeline
   Summary

Introduction

This book aims to get you up to speed with what is, in my opinion, the most powerful data-visualisation stack going: Python and JavaScript. You'll learn enough of big libraries like Pandas and D3 to start crafting your own web data-visualisations and refining your own toolchain.
Expertise will come with practice but this book presents a shallow learning curve to basic competence.

If you're reading this in Early Release form I'd love to hear any feedback you have. Please post it to pyjsdataviz@kyrandale.com. Thanks a lot, Kyran.

You'll also find a working copy of the visualisation the book literally and figuratively builds towards at http://kyrandale.com/static/pyjsdataviz/index.html

The bulk of this book tells one of the innumerable tales of data-visualisation, one carefully selected to showcase some powerful Python and JavaScript libraries or tools which together form a toolchain. This toolchain gathers raw, unrefined data at its start and delivers a rich, engaging web-visualisation at its end. Like all tales of data-visualisation it is a tale of transformation, in this case transforming a basic Wikipedia list of Nobel prize-winners into an interactive visualisation, bringing the data to life and making exploration of the prize's history easy and fun.

A primary motivation for writing the book is the belief that, whatever data you have, whatever story you want to tell with it, the natural home for the visualisations you transform it into is the web. As a delivery platform it is orders of magnitude more powerful than what came before and this book aims to smooth the passage from desktop or server-based data analysis and processing to getting the fruits of that labour out on the web.

But the most ambitious aim of this book is to persuade you that working with these two powerful languages towards the goal of delivering powerful web-visualisations is actually fun and engaging.

I think many potential data-viz programmers assume there is a big divide, called Web Development, between them and what they would like to do, which is program in Python and JavaScript. Web-dev involves loads of arcane knowledge about markup-languages, style-scripts, administration etc. and can't be done without tools with strange names like Gulp or Yeoman.
I aim to show that these days that big divide can be collapsed to a thin and very permeable membrane, allowing you to focus on what you do well: programming stuff (see Figure P-1) with minimal effort, relegating the web-servers to data-delivery.

Figure P-1. Here be web-dev dragons (panels: Perception, Reality)

Who this book is for

First off, this book is for anyone with a reasonable grasp of Python or JavaScript who wants to explore one of the most exciting areas in the data-processing ecosystem right now, the exploding field of data-visualisation for the web. It's also about addressing some specific pain-points which in my experience are quite common.

When you get commissioned to write a technical book, chances are your editor will sensibly caution you to think in terms of pain points that your book aims to address. The two key pain points of this book are best illustrated by way of a couple of stories, one my own, the other one that has been told to me in various guises by JavaScripters I know.

Many years ago, as an academic researcher, I came across Python and fell in love. I had been writing some fairly complex simulations in C(++) and Python's simplicity and power was a breath of fresh air from all the boilerplate, makefiles, declarations and definitions and the like. Programming was fun, Python the perfect glue, playing nicely with my C(++) libraries (Python wasn't then and still isn't a speed demon) and doing, with consummate ease, all the stuff that in low level languages is such a pain, e.g. file I/O, database access, serialisation etc. I started to write all my graphical user interfaces (GUIs) and visualisations in Python, using wxPython, PyQt and a whole load of other refreshingly easy toolsets.
Now there's some stuff there that I think is pretty cool but I doubt I'll ever get around to the necessary packaging, version checking and various other hurdles to distribution so no-one else will ever see it.

At the time there existed what in theory was the perfect universal distribution system for the software I'd so lovingly crafted, namely the web-browser. Available on pretty much every computer on earth, with its own built-in, interpreted programming language: write once, run everywhere. But everyone knew that a) Python doesn't play in the web-browser's sandpit and b) browsers were incapable of ambitious graphics and visualisations, being pretty much limited to static images and the odd jQuery transformation. JavaScript was a 'toy' language tied to a very slow interpreter, good for little DOM tricks but certainly nothing approaching what I could do on the desktop with Python. So that route was discounted, out of hand. My visualisations wanted to be on the web but there was no route through.

Fast forward a decade or so and, thanks to an arms race initiated by Google and their V8 engine, JavaScript is now orders of magnitude faster; in fact it's now an awful lot faster than Python (see here for a fairly jaw-dropping comparison). HTML has also tidied up its act a bit, in the guise of HTML5. It's a lot nicer to work with, with much less boilerplate. What were loosely followed and distinctly shaky protocols like Scalable Vector Graphics (SVG) have firmed up nicely thanks to powerful visualisation libraries, D3 being preeminent. Modern browsers are obliged to work nicely with SVG and, increasingly, 3D in the form of WebGL and its children such as THREE.js. Those visualisations I was doing in Python are now possible on your local web-browser and the payoff is that, with very little effort, they can be made accessible to every desktop, laptop, smartphone and tablet in the world.

So why aren't Pythonistas flocking to get their data out there in a form they dictate?
After all, the alternative to crafting it yourself is leaving it to somebody else, something most data-scientists I know would find far from ideal. Well, first there's that term Web development, connoting complicated markup, opaque stylesheets, a whole slew of new tools to learn, IDEs to master. And then there's JavaScript itself, a strange language, thought of as little more than a toy until recently and having something of the neither fish nor fowl to it. I aim to take those pain-points head-on and show that you can craft modern web-visualisations (often single page apps) with a very minimal amount of HTML and CSS boilerplate, allowing you to focus on the programming, and that JavaScript is an easy leap for the Pythonista, having a lot in common. But you don't have to leap: Chapter 2 is a language-bridge, which aims to help Pythonistas and JavaScripters bridge the divide between the languages by highlighting common elements and providing simple translations.

The second story is a common one I run into among JavaScript data-visualisers I know. Processing data in JavaScript is far from ideal. There are few heavyweight libraries and although recent functional enhancements to the language make data-munging much more pleasant, there's still no real data-processing ecosystem to speak of. So there's a distinct asymmetry between the hugely powerful visualisation libraries available, D3 as ever paramount, and the ability to clean and process any data delivered to the browser. All of this mandates doing your data-cleaning, processing and exploration in another language or with a toolkit like Tableau and this often devolves into piecemeal forays into vaguely remembered Matlab, the steepish learning curve that is R or a Java library or two.

Toolkits like Tableau, although very impressive, are often, in my experience, ultimately frustrating for programmers. There's no way to replicate in a GUI the expressive power of a good, general pur-
User comments
This is the Early Release version.