Tabula: Extract Tables from PDFs

Sculpting text with regex, grep, sed and awk
Theory: Regular languages
Many tools for searching and sculpting text rely on a pattern language known as regular expressions. The theory of regular languages underpins regular expressions. (Caveat: Some modern "regular" expression systems can describe irregular languages, which is why the term "regex" is preferred for these systems.) Regular languages are a class of formal language equivalent in power to those recognized by deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs). [See my post on converting regular expressions to NFAs.]
In formal language theory, a language is a set of strings. For example, {"foo"} and {"foo", "foobar"} are formal (if small) languages. (Mathematicians don't typically put quotes around a string, preferring to let the fixed-width typewriter font distinguish it as one, but I'm guessing that programmers are more comfortable with the quotes around strings.) In regular language theory, there are two atomic languages:
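To make the "a language is a set of strings" idea concrete, here is a minimal Python sketch using the standard `re` module. The pattern `foo(bar)?` and the helper name `in_language` are illustrative assumptions, not taken from the original post; the pattern is chosen because it describes exactly the small language {"foo", "foobar"} mentioned above.

```python
import re

# The two small languages from the text, written as Python sets of strings.
language_a = {"foo"}
language_b = {"foo", "foobar"}

# Illustrative pattern (an assumption, not from the original post):
# it describes exactly the strings "foo" and "foobar".
pattern = re.compile(r"foo(bar)?")


def in_language(s: str) -> bool:
    """Return True if the whole string s belongs to the pattern's language."""
    # fullmatch requires the entire string to match, not just a prefix.
    return pattern.fullmatch(s) is not None


# Filter some candidate strings down to those the pattern accepts.
candidates = ["foo", "foobar", "fo", "foobarbar", ""]
accepted = {s for s in candidates if in_language(s)}
print(accepted == language_b)
```

Using `fullmatch` rather than `search` matters here: membership in a language is a statement about the whole string, not about some substring of it.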

Mathematica: The Most Accomplished Technical Computing Tool
With energetic development and consistent vision for three decades, Mathematica stands alone in a huge range of dimensions, unique in its support for today's technical computing environments and workflows.
A Vast System, All Integrated: Mathematica has over 6,000 built-in functions covering all areas of technical computing, all carefully integrated so they work perfectly together, and all included in the fully integrated Mathematica system.
Not Just Numbers, Not Just Math, But Everything: Building on three decades of development, Mathematica excels across all areas of technical computing, including neural networks, machine learning, image processing, geometry, data science, visualizations and much more.
Unimaginable Algorithm Power: Mathematica builds in unprecedentedly powerful algorithms across all areas, many of them created at Wolfram using unique development methodologies and the unique capabilities of the Wolfram Language.
Higher Level Than Ever Before: Superfunctions, meta-algorithms...

Free online OCR - convert PDF to Word or image to text

Vintage and Modern Free Public Domain Images Archive Download - Public Domain Images | Free Stock Photos

Public preview of project codename “GeoFlow” for Excel delivers 3D data visualization and storytelling
Editor’s note: Since this article was originally published in April 2013, Project codename “GeoFlow” has been renamed Power Map as part of the new Power BI for Office 365 offering.
Today we are announcing the availability of the project codename “GeoFlow” Preview for Excel 2013, a result of collaborations between several teams within Microsoft. GeoFlow lets you plot geographic and temporal data visually, analyze that data in 3D, and create interactive “tours” to share with others. GeoFlow originated in Microsoft Research, evolving out of the successful WorldWide Telescope project for scientific and academic communities to explore large volumes of astronomical and geological data.
With GeoFlow, you can: Map Data: Plot more than one million rows of data from an Excel workbook, including the Excel Data Model or PowerPivot, in 3D on Bing maps. Unlocking insights within geospatial data like ticket sales is now possible with GeoFlow. Find out more about Microsoft BI.

OpenRefine

Digital tools for researchers | Connected Researchers
Find out how digital tools can help you:
Explore the literature: Here is a collection of digital tools that are designed to help researchers explore the millions of research articles available to date. Search engines and curators help you to quickly find the articles you are interested in and stay up to date with the literature. Article visualization tools enhance your reading experience, for instance by helping you navigate from one paper to another. Subsections: Search engines and curators; Article visualization.
Find and share data and code: Managing large sets of data and programming code is already unavoidable for most researchers.
Connect with others: Research cannot stay buried in the lab anymore! Subsections: Connect with experts and researchers; Outreach; Citizen science; Crowdfunding.
At the bench and in the office: Here is a collection of tools that help researchers in their everyday tasks. Subsections: Lab and project management; Electronic lab notebook; Outsourcing experiments.

Four steps to analyzing big data with Spark
By Andy Konwinski, Ion Stoica, and Matei Zaharia
In the UC Berkeley AMPLab, we have embarked on a six-year project to build a powerful next-generation big data analytics platform: the Berkeley Data Analytics Stack (BDAS). We have already released several components of BDAS including Spark, a fast distributed in-memory analytics engine, and in February we ran a sold-out tutorial at the Strata conference in Santa Clara teaching attendees how to use Spark and other components of the BDAS stack. In this blog post we will walk through four steps to getting hands-on with Spark to analyze real data. What makes Spark so fast? Follow these four steps and see for yourself how easy it is to get up and running with Spark: Familiarize yourself with the Spark project. Our vision is to build the next-generation stack of open-source data analytics software, and the Spark cluster computing framework is a major step towards that vision.

WebPlotDigitizer - Copyright 2010-2017 Ankit Rohatgi

libre-innovation.org

Spreadsheet converts tweets for social network analysis in Gephi
EDIT 05/15/13: I’ve posted two scripts, one in PHP and one in Python, that overcome the main limitation of this spreadsheet: they pull in all mentioned names rather than just the first one. Download one or both here.
If you’ve ever wanted to visualize Twitter networks but weren’t sure how to get the tweets into the right format, this spreadsheet I’ve been using in my classes might be worth a try. Download the file and open it locally in Excel or OpenOffice to add your own data (right now it uses some of my recent tweets as example data).
1. Add the username(s) of your tweet author(s) to column A of the “code lives here” worksheet.
2. Add your author(s)’ tweets to column B.
3. Copy columns C through H as far down as your tweets go.
4. Export the “output lives here” worksheet as a CSV and open it in Gephi (you may need to copy the formulae in columns A and B as far down as your data go).
Here is a network graph of the example data.
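The conversion described above (author in one column, tweet text in the next, mentions pulled out as network edges) can be sketched in a few lines of Python. This is not the author's posted script, which is not reproduced here; the function names, the mention pattern `@(\w+)`, the sample data, and the `Source`/`Target` column headers are all assumptions made for illustration. Unlike the spreadsheet, the sketch emits one edge per mentioned name rather than only the first.

```python
import csv
import io
import re

# Assumed mention syntax: "@" followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")


def tweets_to_edges(rows):
    """Turn (author, tweet_text) pairs into (author, mentioned_name) edges."""
    edges = []
    for author, tweet in rows:
        # findall returns every mentioned name, not just the first one.
        for name in MENTION.findall(tweet):
            edges.append((author, name))
    return edges


def edges_to_csv(edges):
    """Serialize edges as CSV text with assumed Source/Target headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical sample data standing in for columns A and B.
sample = [
    ("alice", "Thanks @bob and @carol!"),
    ("bob", "No mentions here"),
]
edges = tweets_to_edges(sample)
print(edges_to_csv(edges))
```

Tweets without any mention contribute no edges, so authors who never mention anyone will not appear in the output unless someone else mentions them. The simple pattern would also pick up the domain part of an email address, so real data may need a stricter rule.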

GEOFLA®
GEOFLA® Communes, metropolitan France and overseas departments (DOM): shapefile format, Lambert-93 projection for metropolitan France and UTM for overseas territories
2016 Edition - Version 2.2:
Download GEOFLA® 2016 v2.2 Communes France Métropolitaine (7z, 19.6 MB)
Download GEOFLA® 2016 v2.2 Communes Guadeloupe (7z, 1.8 MB)
Download GEOFLA® 2016 v2.2 Communes Martinique (7z, 1.8 MB)
Download GEOFLA® 2016 v2.2 Communes Guyane (7z, 1.8 MB)
Download GEOFLA® 2016 v2.2 Communes Réunion (7z, 1.8 MB)
Download GEOFLA® 2016 v2.2 Communes Mayotte (7z, 1.8 MB)
2015 Edition - Version 2.1:
Download GEOFLA® 2015 v2.1 Communes France Métropolitaine (7z, 19.6 MB)
Download GEOFLA® 2015 v2.1 Communes Guadeloupe (7z, 1.8 MB)
Download GEOFLA® 2015 v2.1 Communes Martinique (7z, 1.8 MB)
Download GEOFLA® 2015 v2.1 Communes Guyane (7z, 1.8 MB)
Download GEOFLA® 2015 v2.1 Communes Réunion (7z, 1.8 MB)
Download GEOFLA® 2015 v2.1 Communes Mayotte (7z, 1.8 MB)
2015 Edition - Version 2.0:
2014 Edition - Version 2.0:

38 Tools For Beautiful Data Visualisations | DBi UK
As we enter the Big Data era, it becomes more important to expand our capacity to process information for analysis and communication purposes. In a business context, this is evident: good visualisation techniques can support statistical treatment of data, or even become an analysis technique in their own right. They can also be used as a communication tool to report insights that inform decisions. Today there are plenty of tools that can be used to improve your data visualisation efforts at every level. Below is a non-exhaustive list of resources.
JavaScript Libraries: Circular Hierarchy – D3.js
Python Libraries: Kartograph.py – Maps; igraph – Node-link, trees; Matplotlib – Most types of statistical plots; Pycha – Pie chart, bar chart, area chart; NetworkX – Node-link
Java / PHP
Web Applications: TileMill – Running Map
Programming Languages: Hyperbolic Tree – NYTimes 365/360 Generated with Processing
Desktop Applications: Treemap – Tableau
