This post shows how to import index data from EDGAR into a SQL Server database, using T-SQL code and the SSIS Import Wizard configuration settings.
“Since 1934, the SEC has required disclosure in forms and documents. In 1984, EDGAR began collecting electronic documents to help investors get information”
First download the latest master.idx from http://www.sec.gov/edgar/indices/fullindex.htm and edit the file as shown below.
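As a rough illustration of what the edited file should contain, here is a minimal Python sketch that keeps only the pipe-delimited data rows of master.idx. It assumes the edit amounts to dropping the descriptive header block, and that the data begins after a line made only of dashes. The function name parse_master_idx and the sample company name, form type and date are made up for illustration; only the file path is taken from this post.

```python
from typing import Iterable, Iterator, List

# Column order of the data rows in master.idx.
COLUMNS = ["CIK", "Company Name", "Form Type", "Date Filed", "Filename"]


def parse_master_idx(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield one 5-field row per filing, skipping the header block."""
    in_data = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not in_data:
            # Assumption: data rows start after a line made only of dashes.
            if line and set(line) == {"-"}:
                in_data = True
            continue
        fields = line.split("|")
        # Ignore blank or malformed lines that do not have exactly 5 fields.
        if len(fields) == len(COLUMNS):
            yield [field.strip() for field in fields]


# Illustrative sample; company name, form type and date are placeholders.
sample = [
    "Description:           Master Index of EDGAR Dissemination Feed",
    "",
    "CIK|Company Name|Form Type|Date Filed|Filename",
    "--------------------------------------------------------------------",
    "1000045|EXAMPLE CO|10-Q|2013-02-08|edgar/data/1000045/0001193125-13-046001.txt",
]
rows = list(parse_master_idx(sample))
print(rows[0][0], rows[0][4])
```

For a real file, the same function can be fed an open file object, for example parse_master_idx(open("master.idx", encoding="latin-1")); the encoding here is a guess, not something the SEC index documents.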
Create a database called Edgar
Create a table called master_02102013 using the T-SQL code below.
Right-click the Edgar database, click Tasks, then Import Data, and use the settings below.
USE [Edgar]
GO
/****** Object: Table [dbo].[master_02102013] Script Date: 2/10/2013 1:41:59 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[master_02102013](
[MasterID] [int] IDENTITY(1,1) NOT NULL,
[CIK] [varchar](50) NULL,
[Company Name] [varchar](200) NULL,
[Form Type] [varchar](50) NULL,
[Date Filed] [varchar](50) NULL,
[Filename] [varchar](50) NULL,
CONSTRAINT [PK_master_02102013] PRIMARY KEY CLUSTERED
(
[MasterID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
Make sure Enable identity insert is checked off. Also make sure you select the option to delete existing rows, NOT the option to append rows.
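Values longer than the column widths in the CREATE TABLE statement above can cause the import to fail or lose data, so it may be worth checking the rows before running the wizard. The following is a small Python sketch of such a check. The helper name find_oversized is hypothetical, and the rows are assumed to be lists of five strings in the same column order as the table (without MasterID).

```python
# Column widths copied from the CREATE TABLE statement above.
MAX_WIDTHS = {
    "CIK": 50,
    "Company Name": 200,
    "Form Type": 50,
    "Date Filed": 50,
    "Filename": 50,
}


def find_oversized(rows):
    """Return (row_index, column, length) for each value that would not fit."""
    problems = []
    for index, row in enumerate(rows):
        # Pair each value with its column name and varchar limit, in order.
        for (column, limit), value in zip(MAX_WIDTHS.items(), row):
            if len(value) > limit:
                problems.append((index, column, len(value)))
    return problems


# Illustrative rows; the second one has a deliberately oversized company name.
example_rows = [
    ["1000045", "EXAMPLE CO", "10-Q", "2013-02-08",
     "edgar/data/1000045/0001193125-13-046001.txt"],
    ["1000045", "X" * 201, "10-Q", "2013-02-08",
     "edgar/data/1000045/0001193125-13-046001.txt"],
]
oversized = find_oversized(example_rows)
print(oversized)
```

An empty list means every value fits; otherwise the offending column can be widened in the table definition before importing.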
Example search T-SQL query to run after the import is completed.
SELECT TOP 100000 [MasterID]
,[CIK]
,[Company Name]
,[Form Type]
,[Date Filed]
,[Filename]
FROM [Edgar].[dbo].[master_02102013]
Result
Coming soon: a script to extract FTP file paths from the above results, and possibly a script (in Python) to download all the files from ftp.sec.gov into appropriate columns in our database, which currently houses 53,398 rows of data.
————–
Script for downloading ONE file, using, say, the data in the first row of the table created above. Save this as test.scr
anonymous
<your email address goes in here>
cd edgar/data/1000045/
ls *.txt
get 0001193125-13-046001.txt
quit
Script for a batch file to execute the above. Save as, for example, gosec.bat
cls
ftp -s:test.scr ftp.sec.gov
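The test.scr contents above follow directly from a row's Filename value: the directory part becomes the cd command and the last path component becomes the get command. As a sketch of how that could be automated, the Python function below builds the same script text from a Filename string. The function name build_ftp_script and the placeholder email address are assumptions for illustration, not part of any existing tool.

```python
import posixpath


def build_ftp_script(filename: str, email: str) -> str:
    """Build the text of an ftp -s: script that downloads one filing."""
    # Split "edgar/data/<CIK>/<accession>.txt" into directory and file name.
    directory, name = posixpath.split(filename)
    lines = [
        "anonymous",          # anonymous FTP user name
        email,                # email address used as the password
        f"cd {directory}/",
        "ls *.txt",
        f"get {name}",
        "quit",
    ]
    return "\n".join(lines) + "\n"


script_text = build_ftp_script(
    "edgar/data/1000045/0001193125-13-046001.txt",
    "user@example.com",  # placeholder address
)
print(script_text)
```

Writing script_text to test.scr and then running gosec.bat would reproduce the single-file download shown above.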