The basic project premise was as follows: decide it would be cool to have a website that
lets you scroll around a Google Maps satellite image to wherever you want, click a button,
and have a trained ML model perform semantic segmentation on all the buildings in view.
Leaving the data wrangling, conversion, processing, augmentation, and formatting to another
section, the website pipeline is as follows:
This section pertains to the Cloud App portion of the website.
First Stage: Create the static satellite image & send to the storage bucket:
User is presented with a static pre-API-call Google Maps JavaScript window.
User activates the Google Maps JavaScript window, and is
assigned a unique (based on browser identity), impermanent Firebase userID.
Firebase authenticates the Google API key, activates the window, and initializes a count
variable to zero.
User scrolls around, zooms in and out, and decides they've found a spot.
During the above step, the map object is constantly listening for changes and re-establishing
the center, zoom, and bounds of the JavaScript map object.
User presses on the "Run Building Identification on current Map View!" button.
The counter is incremented, the bounds are split into north, south, east, and west values,
and the ML serving button is hidden.
Using the earlier-established center and zoom values, build the Static Maps request string
to be used in the next asynchronous function (a rough sketch of this string and the storage
path appears at the end of this stage).
Using the impermanent userID from firebase, establish the firebase.storage() location
that the static image is going to be sent to.
Fetch the static map image from the Google Static Maps API and, once received, send it to
the 'satellite_screenshots' folder within the Google Cloud Storage bucket,
appending the userID & count variables (as described in the preceding step).
Meanwhile, the above steps are wrapped in an asynchronous function, which performs the
following process:
Begin an asynchronous function to fetch and store the static image.
Simultaneously, begin the first asyncCharacterPrint call, which provides
user-facing feedback letting them know that the static image is being
retrieved and sent to the ML processing container.
Wait for two promises to resolve: one for the completion of the static image pull,
and one for the completion of the asyncCharacterPrint.
Once both promises have resolved, begin the second asyncCharacterPrint call,
indicating that the image has been received and ML processing has begun.
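For reference, the string built a few steps back is just a parameterized Static Maps URL, and
the storage destination is just a path keyed on the userID and count. Here is a rough sketch
of both, shown in Python for readability; the site actually builds these strings in browser
JavaScript, and the variable names, image size, and file-naming convention below are
assumptions rather than the real code:

    # Hypothetical sketch of the Static Maps URL and storage-path naming.
    # The real site builds these strings client-side; names and values here are assumed.
    center_lat, center_lng = 40.7128, -74.0060   # taken from the map's current center
    zoom = 18                                     # taken from the map's current zoom level
    user_id = "abc123"                            # impermanent Firebase userID
    count = 1                                     # per-user request counter
    api_key = "YOUR_GOOGLE_MAPS_API_KEY"

    static_map_url = (
        "https://maps.googleapis.com/maps/api/staticmap"
        f"?center={center_lat},{center_lng}"
        f"&zoom={zoom}"
        "&size=640x640"
        "&maptype=satellite"
        f"&key={api_key}"
    )

    # Destination inside the Google Cloud Storage bucket, keyed by user and count.
    storage_path = f"satellite_screenshots/{user_id}_{count}.png"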
This section pertains to the Cloud Run portion of the website.
Second Stage: Run the ML on the image:
First we have to let the Cloud Run instance know that there's a reason for it to spin
up: specifically, that a new image has been uploaded by the user, and that it should pull
that image and identify as many buildings in it as it can.
This is achieved by having set up a Pub/Sub topic subscription to monitor the
satellite_screenshots folder for either of two status updates:
OBJECT_FINALIZE or OBJECT_METADATA_UPDATE.
When either of those status updates occurs, a push notification is sent to
the subscriber, which in this case is the service account associated with the Cloud Run
instance containing the Dockerized Detectron2 model, with post-transfer-learning weights
for building identification.
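One way to set up that bucket notification is with the google-cloud-storage client. The sketch
below is an illustration rather than the exact deployment: the bucket and topic names are
placeholders, and the push subscription that points the topic at the Cloud Run endpoint is
assumed to be configured separately.

    # Hypothetical sketch: wire the bucket to a Pub/Sub topic so that
    # OBJECT_FINALIZE / OBJECT_METADATA_UPDATE events can trigger the Cloud Run service.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-project-bucket")           # assumed bucket name
    notification = bucket.notification(
        topic_name="satellite-screenshots-topic",         # assumed topic name
        event_types=["OBJECT_FINALIZE", "OBJECT_METADATA_UPDATE"],
        blob_name_prefix="satellite_screenshots/",        # only watch this folder
        payload_format="JSON_API_V1",
    )
    notification.create()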
The Cloud Run instance has an API set up to listen for a POST request, which expects
a formatted JSON payload that includes data relating to the satellite image.
Using Flask, we start off by creating an envelope object from the data contained
in the POST call and verifying the contents of the envelope.
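A minimal sketch of that Flask handler, assuming a standard Pub/Sub push envelope; the route,
field handling, and error messages here are assumptions rather than the deployed code.

    # Hypothetical sketch of the Pub/Sub push handler.
    import base64
    import json

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/", methods=["POST"])
    def index():
        envelope = request.get_json()
        # Verify the envelope looks like a Pub/Sub push message.
        if not envelope or "message" not in envelope:
            return "Bad Request: invalid Pub/Sub message format", 400

        message = envelope["message"]
        # The object-change notification arrives base64-encoded in message["data"].
        payload = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
        bucket_path = payload["bucket"]   # bucket name
        file_path = payload["name"]       # object path within the bucket
        # ... hand off to the download / inference / upload steps described below ...
        return ("", 204)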
Next the container extracts the bucket_path and file_path locations from the envelope,
establishes the storage.Client() object, the bucket object, and the blob object.
Next we link the file_name to the blob.name and format it so it can be reused successfully
later. We then create two temporary files using mkstemp(), which returns a tuple in the
form (file descriptor, path); we ignore the file-descriptor portion.
Now we download the image into the first temporary file, open it, and convert it into a
NumPy formatted array.
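Put together, those extraction, temp-file, and download steps look roughly like the following;
the variable names are guesses based on the description above, and bucket_path / file_path are
the values parsed from the envelope in the previous sketch.

    # Hypothetical sketch of the download step; names are assumptions.
    from tempfile import mkstemp

    import numpy as np
    from google.cloud import storage
    from PIL import Image

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_path)
    blob = bucket.blob(file_path)

    # Keep a clean file name around for naming the prediction output later.
    file_name = blob.name.split("/")[-1]

    # mkstemp() returns (file descriptor, path); we only need the paths.
    _, input_path = mkstemp()
    _, output_path = mkstemp()

    # Pull the static satellite image down and load it as a NumPy array.
    blob.download_to_filename(input_path)
    image = np.array(Image.open(input_path).convert("RGB"))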
Now we finally run the image through the pre-trained, specialized Detectron2
algorithm to get a collection of predictions (in the form of another NumPy array).
Next we use Detectron2 functions to create a Visualizer object and scale the image
down to reduce the size of the user's download (which improves interaction speed).
Then we draw the actual predictions (which are an array of instances, since this is
semantic segmentation) and, crucially, do so on a CPU (we could use a GPU
instead by redeploying this image on Cloud Run for Anthos on Google Kubernetes
Engine, but that would increase costs by a factor of more than 10).
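Here is a sketch of the inference and drawing steps, assuming a standard Detectron2
DefaultPredictor built from the transfer-learned weights; the config file, weights path, and
scale factor are placeholders, and image is the NumPy array loaded in the previous sketch.

    # Hypothetical sketch of inference + visualization; paths and values are assumptions.
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor
    from detectron2.utils.visualizer import Visualizer

    cfg = get_cfg()
    cfg.merge_from_file("configs/building_segmentation.yaml")  # assumed config file
    cfg.MODEL.WEIGHTS = "model_final.pth"                      # transfer-learned weights
    cfg.MODEL.DEVICE = "cpu"                                   # CPU keeps Cloud Run costs down

    predictor = DefaultPredictor(cfg)
    # DefaultPredictor expects BGR input by default, while PIL loads RGB.
    outputs = predictor(image[:, :, ::-1])

    # Visualizer takes RGB; scale the drawn output down to cut the user's download size.
    visualizer = Visualizer(image, scale=0.5)
    drawn = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    predicted_img = drawn.get_image()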
Now we grab the formatted image from the CPU-drawn output.
Having created our prediction file name, we append .png to the second temporary file
from earlier. We take our predicted_img object and convert it into an Image (since
right now it's still an array). We then save the Image into our second temporary file,
identifying it as PNG format.
Finally, we create a pointer to a location in our bucket expecting a new blob object,
formatted using the prediction_name established earlier. Then we just upload our second
temporary file (now an image file) into that identified location.
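And a sketch of the save-and-upload step; the prediction naming convention shown here is an
assumption, but whatever it actually is, it has to match what the front end polls for in the
next stage. The variables carry over from the previous sketches.

    # Hypothetical sketch of the save + upload step; the prediction file naming is assumed.
    prediction_name = f"predictions/{file_name}_prediction.png"
    prediction_path = output_path + ".png"

    # predicted_img is still a NumPy array; turn it into an Image and save it as a PNG.
    Image.fromarray(predicted_img).save(prediction_path, format="PNG")

    # Point at the destination blob in the bucket and upload the finished overlay.
    prediction_blob = bucket.blob(prediction_name)
    prediction_blob.upload_from_filename(prediction_path)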
This section pertains to the Cloud App portion of the website again.
Third Stage: Display the processed image on the dynamic Google Maps window:
Immediately after the ML processing button has been pressed, the website begins
attempting to load the processed image. Every 0.5 seconds it performs a
getDownloadURL() request, using the predetermined naming convention.
This request uses the naming format we've just established in the Cloud Run code; at the
start of the calls, that file doesn't exist yet, so the two naming conventions have to
match exactly.
Once there is an affirmative response that the URL exists where we're looking (which
means the image processing is complete and the processed image has been uploaded
successfully into the Google Cloud Storage bucket), we:
Create a GroundOverlay object out of the predicted image, and place it on the
dynamic JavaScript Maps window, using the earlier-defined bounds information saved
when the "Run Building Identification on current Map View!" button was pressed.
Present some user-facing data, and re-enable the button.