This is a C# project that scrapes information from various pipeline meter listings, transforms the returned data into a generic model, and inserts it into a database. This implementation assumes a PostgreSQL database.
The following setup is required to run this program locally:
Each job requires the following environment variables to be set. Put them in a YAML file if you are running the job in a Docker container as a cron job, or in a .env file if you are running it locally.

| Variable Name | Description |
|---|---|
| DB_CONNECTION | Base connection string for the database |
| DB_USERNAME | The username for authenticating into the database |
| DB_PASSWORD | The password for authenticating into the database |
| CHROME_PATH | The path to the Chrome app/exe. In Chrome, go to chrome://version to find the Executable Path |

| Variable Name | Description |
|---|---|
| JOB_TYPE | The type of job to run. This points to a specific runner |
| JOB_RESOURCE_URL | The URL of the page where a download button can be clicked |
| JOB_DOWNLOADED_NAME | The filename with extension (.csv) |
| JOB_RETRIES | The number of times you want to try the job before failing |
| JOB_RETRY_INTERVAL | The amount of time in seconds to wait in between each attempt |

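
As a rough illustration of how the job variables above might be consumed, here is a minimal sketch that reads them into a settings object and fails fast when one is missing. The `JobSettings` record and its `FromEnvironment` method are hypothetical names introduced for this example; the project's actual configuration code may look different.

```csharp
using System;
using System.Globalization;

// Hypothetical settings holder; not part of the project as documented.
public sealed record JobSettings(
    string JobType,
    string ResourceUrl,
    string DownloadedName,
    int Retries,
    TimeSpan RetryInterval)
{
    // Reads the JOB_* variables through the supplied lookup and
    // throws if any of them is missing or empty.
    public static JobSettings FromEnvironment(Func<string, string?> getVariable)
    {
        string Require(string name) =>
            getVariable(name) is { Length: > 0 } value
                ? value
                : throw new InvalidOperationException($"Missing environment variable: {name}");

        return new JobSettings(
            Require("JOB_TYPE"),
            Require("JOB_RESOURCE_URL"),
            Require("JOB_DOWNLOADED_NAME"),
            int.Parse(Require("JOB_RETRIES"), CultureInfo.InvariantCulture),
            // JOB_RETRY_INTERVAL is documented as a number of seconds.
            TimeSpan.FromSeconds(double.Parse(Require("JOB_RETRY_INTERVAL"), CultureInfo.InvariantCulture)));
    }
}
```

Passing the lookup as a delegate keeps the sketch easy to test. In a real run you would call `JobSettings.FromEnvironment(Environment.GetEnvironmentVariable)`.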
Each site with a unique UI will require a runner to be created. See KinderMorganRunner for an example.
Each runner requires its own model with annotations for the CSV parser. See KinderMorganRaw.cs. You can name the properties of the model however you want, but you need to annotate them.

| Annotation | Example | Description |
|---|---|---|
| Name | `[Name("NAME_OF_COLUMN_IN_CSV")]` | This tells the CSV parser which column in the data to map to the property in the class |
| Ignore | `[Ignore]` | This tells the CSV parser to ignore the property when mapping the data to the class |

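
To show how the two annotations fit together, here is a minimal sketch of an annotated model and a helper that parses CSV text into it. It assumes the CSV parser is the CsvHelper NuGet package, whose `Name` and `Ignore` attributes match the table above; the class name `ExampleRaw`, its properties, and the column names are placeholders and are not taken from KinderMorganRaw.cs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration.Attributes;

// Hypothetical raw model; the column names are placeholders, not real CSV headers.
public class ExampleRaw
{
    // Maps the CSV column "METER_NAME" to this property.
    [Name("METER_NAME")]
    public string MeterName { get; set; } = string.Empty;

    // Maps the CSV column "SCHEDULED_QTY" to this property.
    [Name("SCHEDULED_QTY")]
    public decimal ScheduledQuantity { get; set; }

    // Not present in the CSV, so the parser is told to skip it.
    [Ignore]
    public DateTime LoadedAt { get; set; }
}

public static class ExampleCsv
{
    // Parses CSV text (header row first) into a list of ExampleRaw rows.
    public static List<ExampleRaw> Parse(string csvText)
    {
        using var reader = new StringReader(csvText);
        using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
        return csv.GetRecords<ExampleRaw>().ToList();
    }
}
```

Because the properties are matched by the `Name` attribute rather than by property name, the C# names can follow your own conventions regardless of how the source CSV labels its columns.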
The TableItem model probably isn't fully stubbed out; it only has the properties I thought were important, so feel free to add more if you need them. Within this class, there should be a method for each runner model that takes an item of that type and fills out the TableItem class with the values from that type. See TableItem.cs for an example.
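
The mapping step described above could look roughly like the sketch below. All names here (`SampleRaw`, `SampleTableItem`, `FromSampleRaw`, and the properties) are hypothetical stand-ins chosen for illustration; the real TableItem.cs has its own properties and may shape its mapping methods differently, for example as instance methods rather than a static factory.

```csharp
using System;
using System.Globalization;

// Hypothetical raw model standing in for a runner-specific class.
public class SampleRaw
{
    public string MeterName { get; set; } = string.Empty;
    public string ScheduledQuantity { get; set; } = string.Empty;
}

// Hypothetical generic model standing in for TableItem.
public class SampleTableItem
{
    public string Meter { get; private set; } = string.Empty;
    public decimal Quantity { get; private set; }

    // One method like this per raw model: copy the values across and
    // normalize them into the generic shape.
    public static SampleTableItem FromSampleRaw(SampleRaw raw) => new()
    {
        Meter = raw.MeterName.Trim(),
        // NumberStyles.Number accepts thousands separators such as "1,250".
        Quantity = decimal.Parse(raw.ScheduledQuantity, NumberStyles.Number, CultureInfo.InvariantCulture),
    };
}
```

Keeping the conversion next to the generic model means each new runner only adds one method here, and the database insert code never needs to know about site-specific column layouts.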
Once a runner and model have been created, you need to add the runner to the dependency injection setup in the ConfigureServices section of Program.cs so it can be used.

```csharp
services.AddKeyedScoped<IRunner, YOUR_RUNNER>("YOUR_RUNNER_NAME");
```

> **Tip:** The name of the service supplied in the dependency injection is the value that will be used for the JOB_TYPE environment variable. For example, the KinderMorganRunner name is "kinder-morgan".
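
To make the link between the registration key and JOB_TYPE concrete, here is a minimal sketch of registering a keyed runner and resolving it by name with Microsoft.Extensions.DependencyInjection (version 8 or later). The `IRunner` shape, `ExampleRunner`, the `"example"` key, and `RunnerLookup` are assumptions made for this example; the project's real interface and wiring in Program.cs may differ.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical interface shape; the project's IRunner may expose different members.
public interface IRunner
{
    string Describe();
}

public sealed class ExampleRunner : IRunner
{
    public string Describe() => "example runner";
}

public static class RunnerLookup
{
    // Registers one keyed runner, then resolves the runner whose key
    // matches the given job type (the JOB_TYPE value).
    public static string Resolve(string jobType)
    {
        var services = new ServiceCollection();
        services.AddKeyedScoped<IRunner, ExampleRunner>("example");

        using var provider = services.BuildServiceProvider();
        using var scope = provider.CreateScope();

        // Throws InvalidOperationException if no runner is registered under jobType.
        var runner = scope.ServiceProvider.GetRequiredKeyedService<IRunner>(jobType);
        return runner.Describe();
    }
}
```

With this shape, a mistyped JOB_TYPE surfaces as an exception at startup rather than as a silently skipped job.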