Proc Import Read Txt File With No Column Name

How to Import Text Files into SAS

Text files are a common file format to use when importing or exporting data from one data source to another. When importing text files from other data sources or databases, there are many variations in the data structure and delimiters that one can come across.

This article aims to address some of the more common challenges that arise when attempting to import different variations of text files into SAS. A few different tips and methods are also provided along the way.

 Topics covered include:

  1. Importing tab-delimited text files with PROC IMPORT
  2. Importing special character delimited text files with PROC IMPORT
  3. Importing space-delimited text files with PROC IMPORT
  4. Using PROC IMPORT to generate Data Step code for importing text files

Software

Before we continue, make sure you have access to SAS Studio. It's free!

Data Sets

The examples used in this article are based on the text files listed below. The text files are derived from the SASHELP datasets including CARS and ORSALES datasets:

  1. Cars_tab.txt - download
  2. Cars_pipe.txt - download
  3. Orsales_space.txt - download

Before running any of the examples below, you will need to replace the path '/home/your_username/SASCrunch' with a directory that you have read/write access to in your environment.

This can vary depending on the machine you are running SAS on, the version of SAS you are running and the Operating System (OS) you are using.

If you are using SAS OnDemand for Academics, you must first upload the files to the SAS server.

Please visit here for instructions if you are not sure how to do this.
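
One way to avoid editing the path in every example is to store it once in a macro variable. The sketch below is not part of the original examples: the macro variable name path is an assumption made for illustration, and FILEEXIST is used only as a quick check that the file can be found at that location.

```sas
/* Assumed helper: store the folder once; 'path' is a hypothetical name */
%let path = /home/your_username/SASCrunch;

/* FILEEXIST returns 1 if the file is found and 0 otherwise */
%put NOTE: cars_tab.txt found = %sysfunc(fileexist(&path./cars_tab.txt));
```

A datafile value could then be written as "&path./cars_tab.txt". Double quotation marks are needed there, because macro variables are not resolved inside single quotation marks.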

1. Importing a Tab-delimited Text File with PROC IMPORT

With a tab-delimited text file, the variables (columns) are separated by a tab and the files typically end with a ".txt" extension.

In this example, the input file is the cars_tab.txt file. This is a text file based on the SASHELP.CARS dataset.

The first part you need following the PROC IMPORT statement is the datafile argument. The datafile argument is required so that SAS knows where the file you would like to import is stored and what the name of that file is. Within the quotation marks following the datafile argument, you need to add the complete path, including the filename and file extension. Be sure to replace '/home/your_username/SASCrunch' with the correct directory on your machine or environment where cars_tab.txt is saved. In this example, "/home/your_username/SASCrunch" is the path, "cars_tab" is the filename, and ".txt" is the file extension.

To import tab-delimited text files, both the DBMS and DELIMITER options will need to be used. The DBMS value used for this example is DLM. The DLM value tells SAS that you would like to specify a custom delimiter for the dataset.

After closing off the PROC IMPORT statement with a semicolon, a second option, DELIMITER, is added as a separate statement. The value of DELIMITER for a tab-delimited file is '09'x, which is the hexadecimal representation of a TAB on an ASCII platform.
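
To see what the '09'x constant holds, a small check such as the following can be run. It is only an illustration, and the variable names tab and code are made up for this sketch. On an ASCII platform, RANK should report 9, the character code for a tab.

```sas
/* Illustrative check that '09'x is the tab character (ASCII platforms) */
data _null_;
  tab  = '09'x;       /* hexadecimal character constant */
  code = rank(tab);   /* position of the character in the collating sequence */
  put code=;          /* writes the code to the log */
run;
```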

Finally, the replace option is included to allow for multiple re-runs and overwrites of the CARS_TAB dataset in WORK. If you prefer not to overwrite the newly imported SAS dataset, you can simply remove the replace option.

Using these parameters, the following code will import the tab-delimited cars_tab.txt file and output a SAS dataset in WORK called CARS_TAB:

proc import datafile = '/home/your_username/SASCrunch/cars_tab.txt'
 out = cars_tab
 dbms = dlm
 replace
 ;
 delimiter = '09'x;
run;

After running the above code, you will notice something is a bit off with the output dataset:

If you were to open the cars_tab.txt file directly using Notepad, Wordpad, TextEdit or similar on your computer, you would notice that this file has an extra row of invalid data in it. This type of situation often occurs when the text file is created from another data source.

Fortunately, SAS provides an option that you can add to your PROC IMPORT call to skip this extra line of data that you don't need. By adding the datarow option, you can let SAS know at which row the data (observations) start. In this case, we know that the first row has the headings, the second row has no data, and the observations start on the third row, so we set datarow = 3:

proc import datafile = '/home/your_username/SASCrunch/cars_tab.txt'
 out = cars_tab
 dbms = dlm
 replace
 ;
 delimiter = '09'x;
datarow = 3;
run;

In the output data shown partially below, you will see that extra row has now been removed:
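
If you want to confirm the result yourself, a quick check along the following lines may help. This is a sketch rather than part of the original example; it assumes the CARS_TAB dataset was created in WORK by the step above.

```sas
/* List the variable names, types and lengths that PROC IMPORT chose */
proc contents data = cars_tab;
run;

/* Print only the first 5 observations to check the first data row */
proc print data = cars_tab (obs = 5);
run;
```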

2. Importing Text Files Delimited with Special Characters

Since text files can contain any number of special characters as delimiters, the DELIMITER statement can be used with just about any keyboard character.

For example, if the values of a text file are delimited with the pipe "|", you can simply specify the pipe symbol in the DELIMITER statement, similar to how we used '09'x for tab-delimited files. In this example, the cars_pipe.txt file is read in to create the CARS_PIPE SAS dataset in the WORK library:

proc import datafile = '/home/your_username/SASCrunch/cars_pipe.txt'
 out = cars_pipe
 dbms = dlm
 replace
 ;
delimiter = '|';
run;

After updating the path in the datafile argument and running the above code, you will notice that while the columns have been read in correctly, the variable names are not right and actual values are being used as the variable names:

If you were to open the cars_pipe.txt file directly using Notepad, Wordpad, TextEdit or similar text editors on your computer, you would notice that this text file has no column headings and the data starts directly in the first row.

To get around this, you need to let SAS know that there are no column headings provided in the input text file. By default, there is a GETNAMES option in PROC IMPORT which is set to YES. With this setting equal to YES, SAS assumes that the first row of data contains the column headings, which ultimately end up as the SAS variable names. When this is not the case, simply set GETNAMES = NO to let SAS know there are no column headings provided in the input file:

proc import datafile = '/home/your_username/SASCrunch/cars_pipe.txt'
 out = cars_pipe
 dbms = dlm
 replace
 ;
 getnames = no;
 delimiter = '|';
run;

Now in the output data, all the records will be found in the dataset itself, but the variables will have generic names from VAR1 up to VAR15 in this case, since there are 15 columns:

To fix the variable names, you could, for example, use the SAS Data Step with the RENAME statement to create a new dataset. As an example, the following Data Step code would create a dataset called CARS_PIPE_CLEAN, which uses the RENAME statement to set the appropriate variable names as shown here:

data cars_pipe_clean;
 set cars_pipe;
 rename           var1 = make
                          var2 = model
                          var3 = type
                          var4 = origin
                        /*var5 = ...
                          var15 = ...*/
                             ;
run;
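
As an alternative to the Data Step above, the variables can also be renamed in place, without creating a second dataset, by using PROC DATASETS. This is a different technique from the one the article uses and is shown only as a sketch. It assumes CARS_PIPE already exists in WORK with the generic names VAR1 to VAR15, and it renames only the first four variables, just as the Data Step example does.

```sas
/* Rename variables in place; NOLIST suppresses the directory listing */
proc datasets library = work nolist;
  modify cars_pipe;
  rename var1 = make
         var2 = model
         var3 = type
         var4 = origin;
quit;
```

Because only the dataset header is updated, this approach avoids rewriting the data, which can matter for large tables.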

3. Importing Space-delimited Text Files with PROC IMPORT

Space-delimited text files are yet another common file type you may run into that you would like to import into SAS. By default, setting DBMS = DLM with your PROC IMPORT statement will use a space as the delimiter, so you don't need to explicitly use the DELIMITER statement in this example.

For example, the orsales_space.txt text file contains space-delimited columns, and can be imported into SAS with DBMS = DLM:

proc import datafile = '/home/your_username/SASCrunch/orsales_space.txt'
 out = orsales
 dbms = dlm
 replace
 ;
run;

At first glance, it appears that the import was successful and the ORSALES dataset was successfully created in WORK as shown partially here:

However, if you run a PROC FREQ (code provided below) on the Product_Category variable, you will notice that one of its values is truncated:

proc freq data = orsales;
 tables product_category;
run;

As shown in the Results, "Assorted Sports Articles" is now only "Assorted Sports A" in this newly imported dataset:

This type of situation can often occur when importing datasets into SAS because PROC IMPORT will only check a portion of the records before determining what the appropriate variable types and lengths should be on the output SAS dataset.

The solution to this problem is to include the GUESSINGROWS option with your PROC IMPORT call. By specifying a number for GUESSINGROWS, you can tell SAS how many rows it should scan in your incoming file before determining what the appropriate lengths and variable types should be.

In this example import, there are 912 rows of data. Here, by setting GUESSINGROWS = 912 we can be sure that SAS will pick the largest width necessary to avoid truncation of any data when it completes the import. A new dataset, ORSALES_GUESSINGROWS, is then created so you can see the difference in results:

proc import datafile = '/home/your_username/SASCrunch/orsales_space.txt'
 out = orsales_guessingrows
 dbms = dlm
 replace
 ;
 guessingrows = 912;
run;

By running a PROC FREQ to generate a frequency table on the newly created dataset, we can test whether or not the GUESSINGROWS option was effective:

proc freq data = orsales_guessingrows;
 tables product_category;
run;

As you can see from the output, the Product_Category value "Assorted Sports Articles" now shows up correctly and is no longer truncated:

It's important to note that GUESSINGROWS can be extremely computationally intensive and may significantly slow down the time it takes to import your dataset to SAS. The larger the value you set for GUESSINGROWS, the longer the processing will take, but the more reliable the results will be. The run time will of course depend on your environment, the number of records and the number of variables found in your data.
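
When the number of rows is not known in advance, recent SAS 9.4 releases also accept the keyword MAX for GUESSINGROWS, which asks SAS to scan as many rows as it allows. The following is a sketch under that assumption; the output name ORSALES_MAX is hypothetical, and the same run-time caveat applies.

```sas
/* Sketch: scan all available rows instead of a fixed count */
proc import datafile = '/home/your_username/SASCrunch/orsales_space.txt'
 out = orsales_max
 dbms = dlm
 replace
 ;
 guessingrows = max;
run;

/* Compare the length assigned to Product_Category with the earlier import */
proc contents data = orsales_max;
run;
```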

4. Importing a Tab-delimited File using Data Step

Although the amount of SAS code required to import a text file using the Data Step is longer than the code required for PROC IMPORT, using Data Step code allows for greater flexibility.

By using Data Step code, the variable names, lengths and types can be manually specified at the time of import. The advantage is that this allows you to format the dataset exactly the way you want as soon as it is created in SAS, rather than having to make additional modifications later on.

First, as with any SAS Data Step code, you need to specify the name and location for the dataset you are going to create. Here, a dataset named CARS_DATASTEP_TAB will be created in the WORK library.

The next step is to use the INFILE statement. The INFILE statement in this example is made up of 6 components:

  1. The location of the text file – '/home/your_username/SASCrunch/cars_tab.txt' in this example
  2. Delimiter option – the delimiter found in the input file, enclosed in quotation marks (delimiter is '09'x in this case since it is a tab-delimited file)
  3. MISSOVER option – Tells SAS not to move on to the next line when a record runs out of values; any remaining variables are set to missing
  4. FIRSTOBS – The first row that contains the observations in the input file (set to 3 in this case since the observations start on the third row in the cars_tab.txt file)
  5. DSD – Tells SAS that when a delimiter is found within quotation marks in the data, it should be treated as part of the value and not as a delimiter; two consecutive delimiters are also read as a missing value
  6. LRECL – Maximum length for an entire record (32767 is the value used here, which will ensure no truncation for records of up to 32767 characters)
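
The effect of the DSD and MISSOVER options listed above can be tried on a few in-line records before pointing INFILE at a real file. The sketch below is an illustration only: the dataset name DSD_DEMO, the variable names and the three comma-delimited records are all made up for this purpose.

```sas
/* Hypothetical in-line data to illustrate DSD and MISSOVER */
data dsd_demo;
  infile datalines delimiter = ',' dsd missover;
  length name $20 city $20;
  input name $ city $ score;
  datalines;
"Smith, John",Toronto,90
Lee,,85
Chan,Ottawa
;
run;

proc print data = dsd_demo;
run;
```

In the first record the comma inside the quotation marks stays part of the name, in the second the two consecutive delimiters give a missing city, and in the third MISSOVER sets score to missing instead of reading on into the next line.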

After the INFILE statement, the simplest way to ensure that your variable names, lengths, types and formats are specified correctly is to use a FORMAT statement for each variable. After an appropriate format has been assigned to each variable, the variables that you would like to import should be listed in order after an INPUT statement. Note that character variables should have a dollar sign ($) after each variable name.

Note that you can also optionally specify INFORMATs and LENGTHs here, but in most cases the FORMAT and INPUT statements should be all you need for a successful import.
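
As an illustration of when an INFORMAT can be useful, the sketch below reads a price written with a dollar sign and a comma as a numeric value instead of as text. It is not taken from the original example: the dataset name INFORMAT_DEMO and the two pipe-delimited records are made up, and the COMMA12. informat is one possible choice for stripping the dollar sign and comma.

```sas
/* Hypothetical in-line data: read "$36,945" as the number 36945 */
data informat_demo;
  infile datalines delimiter = '|' dsd missover;
  length make $13;
  informat msrp comma12.;   /* removes dollar signs and commas on input */
  format   msrp dollar12.;  /* controls how the value is displayed */
  input make $ msrp;
  datalines;
Acura|$36,945
Audi|$25,940
;
run;

proc print data = informat_demo;
run;
```

Storing the price as a number means it can be summed or averaged later, which is not possible while it is held as a character value.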

Below is the Data Step code that would successfully import the cars_tab.txt file into a SAS dataset. As mentioned, be sure to update the path to the correct location of the cars_tab.txt file in your environment before running the following code:

data work.cars_datastep_tab;
 /* Tab-delimited input; the observations start on row 3 */
 infile '/home/your_username/SASCrunch/cars_tab.txt'
        delimiter = '09'x
        missover
        firstobs = 3
        dsd
        lrecl = 32767;

 /* The formats set the type and length of each variable */
 format Make $5. ;
 format Model $30. ;
 format Type $6. ;
 format Origin $6. ;
 format DriveTrain $5. ;
 format MSRP $9. ;
 format Invoice $9. ;
 format EngineSize best12. ;
 format Cylinders best12. ;
 format Horsepower best12. ;
 format MPG_City best12. ;
 format MPG_Highway best12. ;
 format Weight best12. ;
 format Wheelbase best12. ;
 format Length best12. ;

 /* Variables are listed in file order; $ marks character variables */
 input
        Make $
        Model $
        Type $
        Origin $
        DriveTrain $
        MSRP $
        Invoice $
        EngineSize
        Cylinders
        Horsepower
        MPG_City
        MPG_Highway
        Weight
        Wheelbase
        Length
 ;
run;

After running the above code, you should see the CARS_DATASTEP_TAB data set, shown partially here:

5. Generating Data Step Code with PROC IMPORT

When the variable names, types, lengths or formats that SAS is automatically generating with PROC IMPORT are not what you are looking for, and you don't want to type out 40+ lines of code as in the previous example, PROC IMPORT can still be a time-saving tool.

Going back to the cars_pipe.txt text file, recall that this text file did not contain column headings.

Re-run the following code to import cars_pipe.txt into SAS and create a temporary dataset, CARS_PIPE, to be stored in WORK:

proc import datafile = '/home/your_username/SASCrunch/cars_pipe.txt'
 out = cars_pipe
 dbms = dlm
 replace
 ;
 getnames = no;
 delimiter = '|';
run;

After running the above code, go to the Log that is created and notice that SAS Data Step code is actually being generated as a result of the PROC IMPORT:

By simply copying and pasting this code from your log into your SAS program, you can now use this code as a template to start your Data Step code, modifying it as needed to adjust variable names, types and lengths.

For example, you can replace the variable names VAR1-VAR15 with the original variable names from CARS, as shown here:

data WORK.CARS_PIPE_CUSTOM    ;
infile '/home/your_username/SASCrunch/cars_pipe.txt' delimiter  =  '|' MISSOVER DSD lrecl = 32767;
     informat make $5. ;
     informat model $30. ;
     informat type $6. ;
     informat origin $6. ;
     informat drivetrain $5. ;
     informat msrp nlnum32. ;
     informat invoice nlnum32. ;
     informat enginesize best32. ;
     informat cylinders best32. ;
     informat horsepower best32. ;
     informat mpg_city best32. ;
     informat mpg_highway best32. ;
     informat weight best32. ;
     informat wheelbase best32. ;
     informat length best32. ;
     format make $5. ;
     format model $30. ;
     format type $6. ;
     format origin $6. ;
     format drivetrain $5. ;
     format msrp nlnum12. ;
     format invoice nlnum12. ;
     format enginesize best12. ;
     format cylinders best12. ;
     format horsepower best12. ;
     format mpg_city best12. ;
     format mpg_highway best12. ;
     format weight best12. ;
     format wheelbase best12. ;
     format length best12. ;
  input
              make $
              model $
              type $
              origin $
              drivetrain $
              msrp
              invoice
              enginesize
              cylinders
              horsepower
              mpg_city
              mpg_highway
              weight
              wheelbase
              length
  ;
run;

After running the above code, a new dataset WORK.CARS_PIPE_CUSTOM is created by importing the cars_pipe.txt text file using the SAS Data Step code we generated using PROC IMPORT.

Source: https://sascrunch.com/importing-text-files/
