We answer frequently asked questions below.
Q1: How do you use all these labeled data?
- We use the data to train an AI model for recognizing smoke emissions. The model is a deep neural network that can learn how to classify videos into two categories: having smoke or no smoke. Then, we use the model to recognize smoke for many dates and camera views (refer to our event page).
Q2: Why do you need help from volunteers?
- While deep neural networks have proven useful in various applications (e.g., object recognition), training such networks requires a considerable amount of labeled data. Annotating all the data would take one researcher hundreds of hours, which is why we need help from volunteers.
Q3: Why is it important to recognize and visualize smoke emissions?
- Our previous work in air quality monitoring shows that visualizing evidence of smoke emissions can influence the attitude of regulators. Also, using such visual data increased the community's confidence when addressing air pollution.
Q4: Where do these video clips come from?
- We selected several windows from the Breathe Cams network and cropped them into videos (as shown in the following image). Each video contains 36 frames, which represent about 6 minutes in real-world time.
Q5: Why did my labels not pass the quality check? How do you define quality?
- In each batch (16 videos) on the page, the system randomly places several videos with known answers, called gold standards. A batch passes the quality check if you label these gold standards correctly.
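The quality check described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's actual code: the function names, the video IDs, and the `n_gold` parameter are hypothetical, and the sketch assumes that every gold standard in a batch must be labeled correctly for the batch to pass.

```python
import random
from typing import Dict, List, Optional

BATCH_SIZE = 16  # videos per batch, as stated in the answer above


def build_batch(unlabeled_ids: List[str], gold_ids: List[str], n_gold: int,
                rng: Optional[random.Random] = None) -> List[str]:
    """Mix n_gold gold standard videos into a batch at random positions.

    n_gold is a hypothetical parameter: the text only says "several".
    """
    rng = rng or random.Random()
    batch = rng.sample(gold_ids, n_gold)
    batch += rng.sample(unlabeled_ids, BATCH_SIZE - n_gold)
    rng.shuffle(batch)  # hide where the gold standards sit in the batch
    return batch


def batch_passes(user_labels: Dict[str, bool],
                 gold_answers: Dict[str, bool]) -> bool:
    """Return True if every gold standard in the batch was labeled correctly.

    Assumption: one wrong (or missing) gold standard label fails the batch.
    """
    return all(user_labels.get(video_id) == answer
               for video_id, answer in gold_answers.items())
```

In this sketch, only the gold standards affect the result; labels on the other videos in the batch are not checked because their correct answers are not yet known.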
Q6: How does the system know if smoke emission is present in a video?
- The system defines the final label by aggregating answers from citizens and researchers. At least two volunteers or one researcher will review each video. If the answers from the two volunteers agree, the system marks the video according to the agreement. Otherwise, another volunteer or researcher will review the video, and the result is aggregated based on majority voting.
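The aggregation rule above can be sketched in Python as follows. This is a minimal illustration, not the system's actual implementation: the function name and its arguments are hypothetical, and the sketch assumes that a researcher's answer, when present, settles the label on its own.

```python
from typing import Optional, Sequence


def aggregate_label(volunteer_votes: Sequence[bool],
                    researcher_vote: Optional[bool] = None) -> Optional[bool]:
    """Return True (smoke), False (no smoke), or None (needs another review)."""
    # Assumption: a single researcher review is enough to decide the label.
    if researcher_vote is not None:
        return researcher_vote
    # Fewer than two volunteer reviews: not enough answers yet.
    if len(volunteer_votes) < 2:
        return None
    yes = sum(volunteer_votes)
    no = len(volunteer_votes) - yes
    # Two volunteers who agree, or a majority after a tie-breaking review.
    if yes != no:
        return yes > no
    # The votes are tied, so another volunteer or researcher must review.
    return None
```

For example, `aggregate_label([True, False])` returns `None` because the two volunteers disagree, and `aggregate_label([True, False, False])` returns `False` once a third volunteer breaks the tie.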
Q7: Why does a dialog box sometimes pop up and ask me to enable video autoplay?
- During labeling, videos need to play automatically. If a mobile device has data saver enabled, videos will not autoplay. Also, some mobile devices pause videos after users wake them from sleep mode. Browsers require a user interaction before they re-enable autoplay, which is why the system shows the dialog box.
Q8: Why do I sometimes see similar videos? Are they the same?
- Videos recorded at close times (e.g., 8:00 and 8:10 am) can look similar because the weather and lighting conditions are the same. Also, gold standard videos for the quality check can appear again if you label many batches.
Q9: Can I build a similar system with your code?
- Yes. This project is open source on GitHub. Please feel free to reuse the code.
Q10: Are there other actions that I can take to advocate for better air quality?
- We recommend checking the materials for smoke reading (EPA Method 9). We also recommend using the Smell Pittsburgh (or Smell MyCity if not in Pittsburgh) application to report pollution odors to the local health department. More resources can be found on Mark Dixon's website.
Q11: Why does this tool not support devices older than Android 7 and iOS 11?
- When labeling smoke, this tool shows 16 videos at the same time. Older devices have difficulty playing these videos, which results in a poor user experience.
Q12: Why are there no nighttime videos to label?
- Smoke emissions in nighttime videos (captured by commercial digital cameras) are tough for the computer to recognize due to insufficient light. We want to focus on training the computer to recognize daytime smoke emissions first.
Q13: Who are your local partners?
- We worked with the Breathe Collaborative to engage citizens in labeling smoke emissions. We thank the Clean Air Council and GASP (Group Against Smog and Pollution) for organizing workshops for labeling smoke events with us.
Q14: Why are some dates and camera views missing on the event page?
- Due to unexpected situations (e.g., severe weather, spiders, system malfunctions), the camera network can be down on some dates, or some views may be unclear. We removed these dates and views from our visualization.
Q15: Why do some videos on the event page have no smoke?
- The AI model can make mistakes. For example, steam and fast-moving cloud shadows may be recognized as industrial smoke emissions. One way to mitigate the problem is to train the model using more annotated videos. We would greatly appreciate it if you could help us label more videos.
Q16: Why is the most recent date on the event page not up to the current date?
- Processing videos involves many manual steps. We have to inspect the views for each date and check if they are in good condition. Then, we need to specify the location (bounding box) on the panorama. Because of these manual steps, we are currently unable to fully automate the smoke recognition system.