We will be showing how to import database data from Edgar, with SQL code as well as SSIS Import Wizard configuration settings.
“Since 1934, the SEC has required disclosure in forms and documents. In 1984, EDGAR began collecting electronic documents to help investors get information”
First import the latest master.idx from http://www.sec.gov/edgar/indices/fullindex.htm and edit the file as shown below.
Create a database called Edgar
Create a table called master1 using T-SQL code below
Right click on Edgar database,click Tasks, then Import dat and use settings below.
/****** Object: Table [dbo].[master_02102013] Script Date: 2/10/2013 1:41:59 AM ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
SET ANSI_PADDING ON
CREATE TABLE [dbo].[master_02102013](
[MasterID] [int] IDENTITY(1,1) NOT NULL,
[CIK] [varchar](50) NULL,
[Company Name] [varchar](200) NULL,
[Form Type] [varchar](50) NULL,
[Date Filed] [varchar](50) NULL,
[Filename] [varchar](50) NULL,
CONSTRAINT [PK_master_02102013] PRIMARY KEY CLUSTERED
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
SET ANSI_PADDING OFF
Make sure Identity insert is checked off. Also make sure you select delete existing rows NOT append row data.
Example Search T-SQL Query after import is completed.
SELECT TOP 100000 [MasterID]
Coming soon – a script, to extract ftp filepaths from above results and possibly a script [in python] tp download from ftp.sec.gov all the files into appropriate columns in our database, which currently houses 53,398 rows of data.
Scriot for downloading ONE file, from say data in first row of table created above. Save this as test.scr
<your email address goes in here>
Script for a batchfile to execute above. Save as, for example: gosec.bat
ftp -s:test.scr ftp.sec.gov