Andy Leone's Perl Page |
||||
Below are some programs for, among other things, downloading and extracting data from Edgar using Perl. If you are new to Perl, there are plenty of resources on the Web to get you started. I have tried to document the programs so that you can modify them to suit your particular needs. In addition, we will be teaching a course at the University of Miami from June 1 to June 5. The course is intended to bring Ph.D. students and faculty up to speed with Perl and Edgar. Click here for details. Before you get started, there is software that you should install on your computer. Click here for instructions and a list of the software you need. |
||||
Example/Sample Programs |
Other Resources |
|||
| Programs for downloading and analyzing data from Edgar | 1. Download Index files (get_index_files.pl) 2. Download Filings (Download_Filings.pl) 4. Extract information from filings, e.g., the audit opinion (read_find_extract_blocks_of_text.pl)
|
|
||
| MySQL | Using the DBI module to create a header record (using_DBI_mysql.pl). Using the DBI module to write the audit opinion to a database (read_and_write_to_mysql.pl). |
Some Notes on MySQL |
||
| SAS and Regex | RegEx Basics (RegEx_Basics.SAS) |
|||
| SQL Primer | SAS / SQL Examples | |||
| SAS Macros |
Compute CARs (sas_class10.sas) |
|||
| Extracting Data from PDF files | Example program: reads analyst reports from Investext. It uses a third-party module that you have to purchase, the PDFlib Text Extraction Toolkit (TET). The module currently runs only under Perl 5.8 and will not work with 5.10. | |||
|
Quick Reference for Regex
|
||||