Image Processing Core AWS Step Function
The image-step-function state machine is the core step function that controls the process from raw imagery to finished orthomosaics, dense clouds (not exported at the moment) and DEMs. Several lambdas convert the raw data to a useful format and populate databases. The step function is called from a DynamoDB fanout lambda, DynamoStreamFanoutFunction, which is triggered on a new item insertion. The step function can also be called directly from an event; however, the number of parameters it takes is large and cumbersome for an operator. Step function invocation, or 'job creation', is therefore done from the Admin Interface. Jobs are items saved in the DynamoDB Job Table; these control all the parameters needed for processing. They do not only control this specific step function: the parameters are also used and updated by later processes in Postprocessing. Postprocessing therefore always requires a job item (session) to exist.
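For orientation, the fanout pattern roughly looks like the sketch below. This is a minimal illustration, not the actual DynamoStreamFanoutFunction: the environment variable name, the IMAGE_STEP_FUNCTION_ARN value, and the assumption that the job item stores the execution input as a JSON string under a "parameters" attribute are all hypothetical.

import json
import os

import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical environment variable; the real fanout lambda may resolve the ARN differently.
STATE_MACHINE_ARN = os.environ["IMAGE_STEP_FUNCTION_ARN"]


def handler(event, context):
    """Start one execution per newly inserted job item (illustrative sketch)."""
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue
        new_image = record["dynamodb"]["NewImage"]
        # Assumed item layout: the step function input is stored as a JSON string
        # under a "parameters" attribute. The real job item may differ.
        execution_input = new_image["parameters"]["S"]
        job_id = json.loads(execution_input)["job_id"]
        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            name=job_id,
            input=execution_input,
        )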
Step Functions
The following step functions are used within this step function:
- An ImageStartECS step function that starts an EC2 instance within the ECS cluster based on the given parameters. It contains fallback procedures to account for multi-region options.
- A core-ml-orchestration step function that orchestrates the machine learning process flow.
Lambdas
The lambdas orchestrated by the Step function are the following:
- An ImageEtlIdentifierFunction to identify any ETL procedures necessary to standardize data to our requirements.
- A UtilityBatcherFunction that separates a list into a list of lists of a given batch size. Used in the step function to prepare input for a mapper (see the sketch after this list).
- An ImageEtlActionFunction to act on the ETL identification and standardize the data.
- An ImageGetFilesFunction that wraps the boto3 S3 list call to collect all the raw image files and sends the payload JSONs to S3.
- An ImageExtractExifFunction to collect all the metadata required for further processing. It can decide between .shp metadata, .txt metadata or internally written EXIF data.
- An ImageCoordinatesFunction that takes the image coordinates, creates polygons out of them and populates the imageprocessing-db cameras table.
- An ImageWeatherFunction. Warning: decoupled.
- An ImageTilerFunction that cuts the RGB output into tiles of a defined size; the default is 512x512 pixels.
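The batching performed by UtilityBatcherFunction (referenced above) amounts to splitting a flat list into chunks that a Map state can iterate over. A minimal sketch of that idea is shown below; the handler signature, the "items"/"batch_size" keys and the default batch size are assumptions, not the actual function.

def handler(event, context):
    """Split event["items"] into lists of at most event["batch_size"] elements.

    Illustrative only; the real UtilityBatcherFunction may use different keys.
    """
    items = event["items"]
    batch_size = event.get("batch_size", 10)  # assumed default
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    return {"batches": batches}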
Invocation description
The following keys are required:
flights: list(object), {
dem: str, path to reference DEM
prefix: str, prefix to raw image folder
flight: object: {
date: str, string representation of date
number: int, flight identifier
code: str, generated flight code
id: int, unique flight id
}
order_id: int, order ID
dem_size: float, filesize of DEM
}
job_id: str, unique job session ID
crops: list(object), {
name: str, crop name,
id: int, unique crop ID
}
oi_rds_info: str, string representation of the RDS connection object
client_tag: str, client tag to determine costing
gpu: int, number of GPUs to process with; choices are [1, 4, 8], defaults to 8
restart: bool, whether the process needs to restart or start for the first time. Restarting wipes all present data.
volume_size: int, size of attached EBS volume in GB
workteam_arn: str, ARN of sagemaker groundtruth team
preprocess: bool, switch to run or skip preprocessing; good to run this in conjunction with restart=false, default is true
Example
{
"flights": [
{
"dem": "12_ghana/ghana_dem_srtm.tif",
"prefix": "04_raw/20200702_01",
"flight": {
"date": "2020-07-02",
"number": 1,
"code": "20200702_01",
"id": 533
},
"order_id": 1336,
"dem_size": 78.8400354385376
},
{
"dem": "12_ghana/ghana_dem_srtm.tif",
"prefix": "04_raw/20200702_02",
"flight": {
"date": "2020-07-02",
"number": 2,
"code": "20200702_02",
"id": 534
},
"order_id": 1336,
"dem_size": 78.8400354385376
}
],
"job_id": "20220628085757-1336-0793d298f70e44eabd04ac054d52777c",
"crops": [
{
"name": "citrus",
"id": 12
}
],
"gpu": 4,
"oi_rds_info": "{\"PGHost\": \"cppvhxmn7rf4dr.chh12pbne5u7.eu-west-1.rds.amazonaws.com\", \"PGPort\": \"5432\", \"PGDatabase\": \"eaglesensing\"}",
"client_tag": "Arkadiah Technology",
"restart": false,
"workteam_arn": "arn:aws:sagemaker:eu-west-1:427792949008:workteam/private-crowd/private-labelling-team",
"volume_size": 1500,
"preprocess": false
}
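Although jobs are normally created from the Admin Interface, an execution with a payload like the one above can also be started directly. A minimal sketch using boto3 follows; the state machine ARN and the example_input.json filename are placeholders.

import json

import boto3

sfn = boto3.client("stepfunctions")

# The example payload shown above, saved locally under an assumed filename.
with open("example_input.json") as f:
    payload = json.load(f)

response = sfn.start_execution(
    # Placeholder ARN; use the ARN of the deployed image-step-function state machine.
    stateMachineArn="arn:aws:states:eu-west-1:123456789012:stateMachine:image-step-function",
    name=payload["job_id"],
    input=json.dumps(payload),
)
print(response["executionArn"])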
Troubleshooting
Lookup table
| Reference | Content |
|---|---|
| license activation | floating-license-activation |
| license activation offline | offline-activation |
| license directories | /opt/agisoft-rlm-server |
| license secret | arn:aws:secretsmanager:eu-west-1:370748765081:secret:ImageProcessingLicense-prod-Z4GwtR |
| license server webpage | localhost |
| ppu activation | /core/ec2/image_metashape/license_hack.py |
| rlm_tmp_path | /var/tmp/agisoft/rlm_server |
| session manager plugin | session-manager-plugin |
| temporary clean config | /usr/lib/tmpfiles.d/tmp.conf |
| bastion | https://eu-west-1.console.aws.amazon.com/ec2/home?region=eu-west-1#InstanceDetails:instanceId=i-0d9cda5515b1a94bf |
Configure session manager
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/ubuntu_64bit/session-manager-plugin.deb" -o "session-manager-plugin.deb"
sudo dpkg -i session-manager-plugin.deb
Verify the installation using:
session-manager-plugin
Use session manager to create a tunnel
Find the --target id in the EC2 console and make sure to use the correct --profile.
aws ssm start-session --target i-082d90fd4bee31a0c --document-name AWS-StartPortForwardingSession --parameters "{\"portNumber\":[\"5054\"], \"localPortNumber\":[\"5054\"]}" --profile dev --region eu-west-1
You can now use the browser to connect to the license server webpage.
Generate an offline activation file
Download the license activation offline archive to the server. Navigate to the activate folder in the license directories. Run this command:
./activate 1111-2222-3333-4444 /var/tmp/request-vm.act --offline --product rlm_server_enable_vm --no-rehost
Replace 1111-2222-3333-4444 with the license secret for the FLS.
Note: make sure you use the correct indentation and new lines in the license file. This makes a difference.
This command will create two files, request-vm.act and request-fls.act, which need to be sent to Agisoft support. Mention whether the rlm_tmp_path is empty or not. Also include the rlmdiag.txt file.
Errors
We are invalid
If the following message occurs in the rlmdiag.txt file:
rlm: 10/31 12:45 (agisoft) (expected: license=XXXXX, we are: invalid)
The first step is to check whether the rlm_tmp_path is empty.
[root@ip-10-75-0-10 /]# ls /var/tmp/agisoft/rlm_server/
Do-NOT-Touch-Anything-in-This-RLM-Directory
If this path contains nothing or does not exist, we need to contact Agisoft. Give them the following information:
- Create the rlmdiag.txt file using the license server webpage.
- Attach the file from the license directories to the email to Agisoft
- Mention that the rlm_tmp_path is empty
Await their revocation of the FLS key. Once the revocation is performed, follow the steps described in license activation, using the secrets stored in license secret.
Note: make sure to add the ISV port to the activated license file by adding port=number. The number can be found on the Status page of the license server webpage. The port can also be added to the rlm license file, but as of writing these docs I am not sure it makes a real difference.
Lastly, restart the server by using:
systemctl restart rlm-server
Can't load image
This error occurs during photogrammetry processing in the matchPhotos() step. It is caused by a broken or incomplete image on the EFS. This happens occasionally, especially in large projects, because the lambda feed into EFS isn't completely stable. The image can be restored by removing it from the EFS and rerunning the process.
TODO Move this into a lambda; a rough sketch of such a check follows the manual steps below.
- Connect to the bastion using AWS console
- Move to the mounted EFS, processing zone (pz_00XXX) images (pz_00XXX.images)
cd /mnt/efs3/data/pz_00XXX/pz_00XXX.images
- List all the files and their filesizes. The command creates a file in the project directory (pz_00XXX/filesizes.txt). Define the image format in the command, e.g. *.tiff or '*.jpg'
du -h *.tiff > ../filesizes.txt
- Move the file to S3 and download it to your local machine. You can use any bucket that the EC2 has access to
aws s3 cp ../filesizes.txt s3://nosi-raw-data-prod/99_temp/filesizes.txt
Note: make sure that the AWS CLI is configured on the machine:
aws configure
- Filter the filesizes.txt using Python. We expect all files to have a similar size of 105M. The script outputs simple prints that are rm commands.
# Path to the downloaded filesizes.txt
file_data = '/home/niek/Downloads/filesizes.txt'

with open(file_data, 'r') as file:
    data = file.readlines()

# du -h output is "<size>\t<filename>"; print an rm command for every file
# that does not have the expected ~105M size.
for row in data:
    if not row.startswith('105M'):
        filename = row.split('\t')[1].replace('\n', '')
        print(f"rm {filename}")
- Copy-paste the list of rm commands to the bastion console
rm 20230916_03_RGB_01253.tiff
- Restart the process. You can repeat the same procedure after the lambda task DownloadCamerasEFS has finished to verify that the files are now complete.
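As a starting point for the TODO above, the manual filter step could be consolidated into a single script that scans the project images directory and reports files below an expected size. This is a rough sketch, not an existing lambda; the mount path, the image suffix and the 100 MiB threshold are assumptions and would need to be adjusted per project.

from pathlib import Path

# Assumptions: adjust the processing zone, image suffix and expected size per project.
IMAGES_DIR = Path("/mnt/efs3/data/pz_00XXX/pz_00XXX.images")
EXPECTED_MIN_BYTES = 100 * 1024 * 1024  # just below the ~105M expected per image


def find_broken_images(images_dir: Path, min_bytes: int, suffix: str = ".tiff"):
    """Return image files that are smaller than the expected size."""
    return [p for p in sorted(images_dir.glob(f"*{suffix}")) if p.stat().st_size < min_bytes]


if __name__ == "__main__":
    for broken in find_broken_images(IMAGES_DIR, EXPECTED_MIN_BYTES):
        # Print rm commands instead of deleting, mirroring the manual procedure above.
        print(f"rm {broken.name}")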