…Despite being a chilly & wintery March up here in the White Mountains, there is no shortage of fun birds and exciting projects!

So many Redpolls are keeping the Juncos company this year! Two pairs of Hooded Mergansers moved in just next door last week!
Local Sharp-shinned hawks & Fluffy Red Foxes have been busy careening around town gobbling up prey left and right; they seem to know Spring is right around the corner!

It’s happening, and it’s going to be awesome. Visit this project over here on GitHub.

Overview:

venv:

python3 -m venv mushroomobserver_venv
source mushroomobserver_venv/bin/activate
pip3 install -r requirements.txt

Artifacts: train.tgz, test.tgz, images.tgz, images.json, gbif.zip

python3 preprocess

Fetches & saves off gbif archive to ./static/
- Checks the archive, tries loading it into memory etc
Fetches Leaflet Annotator binary & licenses from JessSullivan/MerlinAI-Interpreters. As of 03/16/21 the annotator still needs to be committed; still fussing with a version for Mushroom Observer
Generates an images.json file from the 500 assets selected by Joe & Nathan
Downloads and organizes the 500 selected assets from images.mushroomobserver.org into ./static/images/<category>/<id>.jpg
- writes out images archive
More or less randomly divvies up testing & training image sets
- writes out example testing/training archives (while training, it’ll probably be easier to resample directly from images.tgz from within Keras)
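
The split-and-archive step above can be sketched roughly as follows. This is a minimal illustration, not the code in the repo: the function names, the 20% test fraction and the fixed seed are all assumptions made for the example.

```python
import random
import tarfile
from pathlib import Path


def split_paths(paths, test_fraction=0.2, seed=0):
    """Shuffle image paths and split them into (train, test) lists.

    test_fraction and seed are assumed values for illustration only.
    """
    shuffled = sorted(str(p) for p in paths)  # sort first so the seed is reproducible
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def write_archive(paths, archive_path, root):
    """Write the given files to a gzipped tarball, keeping <category>/<id>.jpg names."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in paths:
            tar.add(p, arcname=str(Path(p).relative_to(root)))


if __name__ == "__main__":
    images = list(Path("static/images").glob("*/*.jpg"))
    train, test = split_paths(images)
    write_archive(train, "static/train.tgz", "static/images")
    write_archive(test, "static/test.tgz", "static/images")
```

Keeping the category directory in each archive member name means the label can still be read off the path after unpacking.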

python3 train

Fetches, divvies & shuffles train / validation sets from within Keras using archive available at mo.columbari.us/static/images.tgz
More or less running Google’s demo transfer learning training script in train/training_v1.py as of 03/17/21; still need to bring in training operations and whatnot from the merlin_ai/ repo, then experiment with the Danish Mycology Society’s ImageNet v4 notes

Google Colab:

@gvanhorn38 pointed out that Google Colab’s neat Jupyter notebook service will train models for free if things are small enough. I have no idea what the limits are. Fiddle with their intro to image classification on Google Colab here, it’s super cool!

Jupyter:

Leaflet Annotator:

images.json Structure:
- id : taxonID, the MO taxon id
- category_id : The binomen defined in ./static/sample_select_assets.csv; for directories and URIs this is converted to snake case.
- url : Temporary elastic IP address this asset will be available from, just to reduce any excessive / redundant traffic to images.mushroomobserver.org
- src : imageURL, the asset’s source URL from Mushroom Observer

[{
  "id": "12326",
  "category_id": "Peltula euploca",
  "url": "https://mo.columbari.us/static/images/peltula_euploca/290214.jpg",
  "src": "https://images.mushroomobserver.org/640/290214.jpg"
}]
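
As a rough sketch of how one such record could be assembled: the helper names snake_case and make_record are made up for this example, and the URL patterns are simply read off the sample record above rather than taken from the preprocess script.

```python
import json


def snake_case(binomen):
    """Convert a binomen such as 'Peltula euploca' to 'peltula_euploca'."""
    return "_".join(binomen.lower().split())


def make_record(taxon_id, binomen, image_id):
    """Build one images.json entry (hypothetical helper, for illustration)."""
    category_dir = snake_case(binomen)
    return {
        "id": str(taxon_id),
        "category_id": binomen,
        # Temporary mirror, so images.mushroomobserver.org is not hit repeatedly
        "url": f"https://mo.columbari.us/static/images/{category_dir}/{image_id}.jpg",
        # Original asset on Mushroom Observer
        "src": f"https://images.mushroomobserver.org/640/{image_id}.jpg",
    }


if __name__ == "__main__":
    print(json.dumps([make_record(12326, "Peltula euploca", 290214)], indent=2))
```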
Selected asset directory structure:

├── static
│   ├── gbif.zip
│   ├── images
│   │   …
│   │   └── peltula_euploca
│   │       ├── 290214.jpg
│   │       …
│   │       └── 522128.jpg
│   │   …
│   ├── images.json
│   ├── images.tgz
│   ├── js
│   │   ├── leaflet.annotation.js
│   │   └── leaflet.annotation.js.LICENSE.txt
│   └── sample_select_assets.csv
…

Fiddling with the archive:

MODwca.gbif[1].id: Integer: This is the Mushroom Observer taxon id, e.g.
- https://mushroomobserver.org/13
- https://images.mushroomobserver.org/640/13.jpg
MODwca.gbif[1].data: Dictionary: DWCA row data, e.g.
- MODwca.gbif[1].data['http://rs.gbif.org/terms/1.0/gbifID'] = 13
- MODwca.gbif[1].data['http://rs.tdwg.org/dwc/terms/recordedBy'] = Nathan Wilson
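
For a feel of where rows keyed by term URI come from, here is a standard-library-only sketch that reads the core file of a Darwin Core Archive. It is not the MODwca class from the project. It assumes a tab-separated core file with integer ids, and it builds a tiny toy archive in memory (the file name occurrence.txt is an assumption) holding just the two values from the example above.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

NS = "{http://rs.tdwg.org/dwc/text/}"  # XML namespace used by DwC-A meta.xml


def read_core_rows(archive_bytes):
    """Return (row_id, data) pairs, with data keyed by Darwin Core term URI."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find(NS + "core")
        location = core.find(NS + "files/" + NS + "location").text
        id_index = int(core.find(NS + "id").get("index"))
        # Map column index -> term URI (fields without an index are skipped)
        terms = {
            int(f.get("index")): f.get("term")
            for f in core.findall(NS + "field")
            if f.get("index") is not None
        }
        skip = int(core.get("ignoreHeaderLines", "0"))
        text = zf.read(location).decode("utf-8")
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader):
        if line_number >= skip and row:
            rows.append((int(row[id_index]), {t: row[i] for i, t in terms.items()}))
    return rows


def build_toy_archive():
    """Build a one-row archive in memory (toy data, for illustration only)."""
    meta = (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">'
        '<core ignoreHeaderLines="1">'
        "<files><location>occurrence.txt</location></files>"
        '<id index="0"/>'
        '<field index="0" term="http://rs.gbif.org/terms/1.0/gbifID"/>'
        '<field index="1" term="http://rs.tdwg.org/dwc/terms/recordedBy"/>'
        "</core></archive>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("meta.xml", meta)
        zf.writestr("occurrence.txt", "gbifID\trecordedBy\n13\tNathan Wilson\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for row_id, data in read_core_rows(build_toy_archive()):
        print(row_id, data["http://rs.tdwg.org/dwc/terms/recordedBy"])
```

Unlike the example above, every value in the data dictionary comes back as a string here; only the row id is converted to an integer.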

Bits & Bobs, Mushstools & Toadrooms