1. Problem Domains
    1. Getting Big Data Sets
      1. Data Programming: Creating Large Training Sets Quickly
      2. Data Augmentation with Keras (see the augmentation sketch after this list)
      3. Bootstrap the Houzz Dataset
      4. Deep learning requires a lot of data. As a lean startup, can we bootstrap our pre-existing data assets?
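A minimal sketch of the Keras augmentation idea above. The data/train directory layout and the transform ranges are illustrative assumptions, not tuned values.

```python
# Minimal Keras data-augmentation sketch. The data/train path and the
# transform ranges below are illustrative assumptions, not tuned values.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rescale=1.0 / 255,       # scale pixel values to [0, 1]
    rotation_range=15,       # small random rotations
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    zoom_range=0.2,          # random zoom in/out
    horizontal_flip=True,    # mirror room photos left/right
)

train_batches = augmenter.flow_from_directory(
    "data/train",            # hypothetical directory of class subfolders
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
# Any Keras model can then train on the augmented stream:
# model.fit(train_batches, epochs=10)
```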
    2. Object Detection
      1. py-R-FCN Object Detection
      2. You only look once: Unified, real-time object detection (2016), J. Redmon et al. [pdf]
        1. YOLO_object_detection.pdf
      3. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), S. Ren et al. [pdf]
        1. faster_r_cnn.pdf
      4. Mask R-CNN (2017), K. He et al. [pdf]
        1. mask_r_cnn.pdf
      5. Grokstyle API
      6. scikit-image
        1. Dense Daisy Feature Extractor
        2. Histogram of Oriented Gradients (see the HOG sketch after this list)
          1. Sliding window histogram
        3. Template Matching
        4. GLCM Texture Features
        5. Gabors / Primary Visual Cortex “Simple Cells” from an Image
          1. Gabor filter banks for texture classification
        6. Shape Index - eigenvalues of Hessian
        7. Local Binary Pattern for texture classification
        8. Otsu Image Labeling
        9. Watershed Segmentation
      7. We need to know what is and isn't in an image of a living room so we can recommend relevant content
      9. Ideally, we could automatically draw a bounding box around every furniture/accessory-related object and label each one by object class, color, material, brand, form, and style.
      10. GAP: Can we rely on Grokstyle and Google Vision to avoid rolling our own?
      10. Markable?
      11. Vicenze
      12. Look for pros who have authored papers in this domain and hire them.
        1. 1 - 3 month contracts
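As a concrete example of the scikit-image feature extractors listed above, a minimal HOG sketch; the built-in astronaut image stands in for a real living-room photo, and the cell/block parameters are common defaults, not tuned choices.

```python
# Minimal HOG feature-extraction sketch with scikit-image. The built-in
# astronaut image is a stand-in for a real living-room photo.
from skimage import color, data
from skimage.feature import hog

image = color.rgb2gray(data.astronaut())

features, hog_image = hog(
    image,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm="L2-Hys",
    visualize=True,          # also return an image of the gradient histograms
)
print(features.shape)        # flattened descriptor; feed to an SVM or similar classifier
```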
    3. Describing images with words
      1. Deep visual-semantic alignments for generating image descriptions (2015), A. Karpathy and L. Fei-Fei [pdf]
        1. Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.pdf
      2. Show, attend and tell: Neural image caption generation with visual attention (2015), K. Xu et al. [pdf]
        1. show_attend_tell.pdf
      3. Dynamic memory networks for visual and textual question answering (2016), C. Xiong et al. [pdf]
        1. dynamic_memory_visual_QA.pdf
      4. Mosss needs to be able to map written descriptions to visual queries and vice versa. This would create a delightful AI product that could learn by interacting with users.
      5. Ideally, we would have seamless integration between text inputs and mood board creation; it should feel like talking to a professional designer (see the captioning sketch after this list).
      6. GAP: Learned embedding spaces.
        1. Rosetta Stone for humans and style language
        2. Chatbots
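A bare sketch of the caption-generation idea behind the papers above: precomputed CNN image features merged with an LSTM over the caption so far to predict the next word. VOCAB_SIZE, MAX_LEN, and FEAT_DIM are placeholder assumptions, and this is a generic skeleton rather than any paper's exact architecture.

```python
# Bare caption-generation skeleton (CNN features merged with an LSTM language
# model). VOCAB_SIZE, MAX_LEN, and FEAT_DIM are placeholder assumptions.
from tensorflow.keras import Model, layers

VOCAB_SIZE = 10000   # assumed caption vocabulary
MAX_LEN = 20         # assumed maximum caption length
FEAT_DIM = 2048      # assumed size of precomputed CNN image features

# Image branch: project precomputed CNN features into the decoder space.
img_in = layers.Input(shape=(FEAT_DIM,))
img_emb = layers.Dense(256, activation="relu")(img_in)

# Text branch: encode the caption generated so far.
txt_in = layers.Input(shape=(MAX_LEN,))
txt_emb = layers.Embedding(VOCAB_SIZE, 256, mask_zero=True)(txt_in)
txt_enc = layers.LSTM(256)(txt_emb)

# Merge both modalities and predict the next word.
merged = layers.add([img_emb, txt_enc])
out = layers.Dense(VOCAB_SIZE, activation="softmax")(merged)

captioner = Model(inputs=[img_in, txt_in], outputs=out)
captioner.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
captioner.summary()
```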
    4. Scene Understanding
      1. Fully convolutional networks for semantic segmentation (2015), J. Long et al. [pdf]
        1. cnn_semantic_segmentation.pdf
      2. Single-view Spatial Understanding
        1. Geometric Context - Derek Hoiem (CMU)
        2. Recovering Spatial Layout - Varsha Hedau (UIUC)
        3. Geometric Reasoning - David C. Lee (CMU)
        4. RGBD2Full3D - Ruiqi Guo (UIUC)
      3. scikit-image
        1. RAG Thresholding
        2. Chan-Vese Segmentation
        3. Otsu Thresholding
        4. Measuring Region Properties
        5. Random Walker Segmentation
        6. Markers for Watershed Transform
        7. RAG merging
          1. Hierarchical RAG Merging
      4. Any room or living environment has a lot going on in it: objects, arrangement, colors, materials, functional elements, and so on. More concretely, there are objects, surfaces, walls, floors, and windows. Tangential to object detection is the problem of focusing attention on discrete areas within an image. How do we segment regions within an image so we can decompose the design process? (See the segmentation sketch after this list.)
      5. Ideally, we could automagically segment regions of a room. Our design recommendations would be based on an ensemble of models that consider each region independently, interactions between region subsets, and the scene as a whole.
      6. Matterport 3D model/camera
      7. GAP: Doubts regarding data, technical, and temporal resources to invest in this domain.
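A minimal scikit-image segmentation sketch combining two items from the list above (Otsu thresholding and measuring region properties); the built-in sample image stands in for a real room photo.

```python
# Minimal region-segmentation sketch: Otsu threshold -> connected-component
# labeling -> region properties. The sample image stands in for a room photo.
from skimage import color, data, filters, measure

image = color.rgb2gray(data.astronaut())

thresh = filters.threshold_otsu(image)     # global threshold separating light/dark
binary = image > thresh

labels = measure.label(binary)             # label connected regions
for region in measure.regionprops(labels):
    if region.area > 500:                  # ignore tiny specks
        print(region.label, region.area, region.bbox)
```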
    5. Representing Style/Material
      1. Learning Representations from Time Series Data through Contextualized LSTMs
      2. Large-Scale Machine Learning through Spectral Methods: Theory & Practice
      3. Unsupervised representation learning with deep convolutional generative adversarial networks (2015), A. Radford et al. [pdf]
        1. unsupervised_representation_learning.pdf
      4. Wasserstein GAN (2017), M. Arjovsky et al. [pdf]
        1. wasserstein_gan.pdf
      5. Learning to discover cross-domain relations with generative adversarial networks (2017), T. Kim et al. [pdf]
        1. disco_gans.pdf
      6. Generative visual manipulation on the natural image manifold (2016), J. Zhu et al. [pdf]
        1. natural_image_manifold.pdf
      7. Texture networks: Feed-forward synthesis of textures and stylized images (2016), D Ulyanov et al. [pdf]
        1. texture_nets.pdf
      8. At the crux of visual recommendation is the question, what is style? How can we represent schools of thought, design, and style mathematically?
      9. Ideally, we would have embedding spaces for furniture/accessory objects, style classes (boho, industrial, glam, chic), user temperaments, and logistical parameters (budget, square footage). We would then use these embeddings to tailor recommendations/solutions that are viable/optimal across all embeddings (see the embedding sketch after this list).
      10. GAP: Doubts regarding data, technical, and temporal resources to invest in this domain.
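A minimal sketch of one way to get a style embedding space: pooled features from a pretrained CNN compared by cosine similarity. VGG16 is only an assumed backbone (a learned GAN or triplet embedding could replace it), and load_images is a hypothetical helper, so the usage lines stay commented out.

```python
# Minimal style-embedding sketch: pooled features from a pretrained CNN,
# compared by cosine similarity. VGG16 is an assumed backbone; load_images
# is a hypothetical helper, so the usage lines are left commented out.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

encoder = VGG16(weights="imagenet", include_top=False, pooling="avg")

def embed(images):
    """Map a batch of 224x224 RGB images to unit-norm 512-d embeddings."""
    feats = encoder.predict(preprocess_input(images.astype("float32")))
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)

# Hypothetical usage: rank catalog items against an inspiration image.
# inspiration = embed(load_images(["inspiration.jpg"]))   # placeholder loader
# catalog = embed(load_images(catalog_paths))
# scores = (catalog @ inspiration.T).ravel()              # cosine similarity
# print(scores.argsort()[::-1][:10])                      # ten closest items
```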
    6. Propensity Modeling
      1. Frequentist model - Bag of Furniture
      2. Furniture N-grams
        1. Chair next to couch and coffee table
      3. Taste Profile + Style Quiz ---> rule-based decision tree
      4. What is the likelihood of a user purchasing object X? (See the propensity sketch after this list.)
      5. Send list of ingredients to Bjarke and SOM
      6. Talk to Gilda
        1. Label ingredients in an image
        2. Draw Grokstyle box around each image
        3. Save Grokstyle results to a folder that inherits information from moodboard
      7. Get on Upwork
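A minimal sketch of a purchase-propensity model as framed above (likelihood of a user purchasing object X), using logistic regression over illustrative features; the feature names and synthetic data are assumptions, not our real schema.

```python
# Minimal purchase-propensity sketch: logistic regression over illustrative
# user/item features and synthetic labels (placeholders, not our real schema).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.integers(0, 2, n),     # saved item to a mood board?
    rng.integers(0, 10, n),    # style-quiz tags matching the item
    rng.uniform(0, 1, n),      # price relative to stated budget
])
y = rng.integers(0, 2, n)      # purchased / not purchased (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

print(model.predict_proba(X_test[:5])[:, 1])   # P(purchase) for five held-out pairs
```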
    7. Recommender Systems
      1. Towards Conversational Recommender Systems
      2. We plan for users to interact with Mosss visually (upload images) and verbally (style quiz, chatbot, etc.). There are also opportunities to pull in more user information from outside sources (FB, Pinterest, Instagram, Google, etc.). How do we leverage all of this information to provide meaningful recommendations to our users?
      3. Ideally, our recommendations would use data from these diverse sources intelligently, pulling in only as much as needed to stay cost-effective.
      4. GAP: We're currently relying on heuristics (color, material/texture), APIs, and rule-based systems (style quiz) to filter user recommendations (see the ranking sketch below). We need machine vision to learn the visual nuances of style/taste and chatbots to converse with users so we can span the full design/consultation process.
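A minimal sketch of the current heuristic filtering described in the GAP above: catalog items and a user taste profile as vectors over hand-picked attributes, ranked by cosine similarity. The attribute set and numbers are illustrative.

```python
# Minimal content-based ranking sketch over hand-picked attributes.
# Columns (illustrative): [warm colors, cool colors, wood, metal, boho, industrial]
import numpy as np

item_features = np.array([
    [0.9, 0.1, 0.8, 0.0, 0.7, 0.1],   # e.g. rattan chair
    [0.2, 0.8, 0.1, 0.9, 0.1, 0.8],   # e.g. steel coffee table
    [0.7, 0.3, 0.6, 0.2, 0.8, 0.1],   # e.g. jute rug
])
user_profile = np.array([0.8, 0.2, 0.7, 0.1, 0.9, 0.0])   # from style quiz + uploads

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores = [cosine(item, user_profile) for item in item_features]
print(np.argsort(scores)[::-1])   # item indices, best match first
```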
    8. Style Transfer
      1. Deep Photo Style Transfer (2017), F. Luan et al. [pdf]
        1. deep_style_transfer.pdf
      2. If we know enough about a user's taste, can we take their inspiration images as input and generate ideal furniture and accessories, even if those objects don't exist yet (as sketched below)? This capability could be used to find the best products that match the ideal, and could yield marketable information for designers and brands.
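A minimal sketch of the Gram-matrix style representation that neural style transfer methods (including Deep Photo Style Transfer) build on; the random arrays stand in for CNN feature maps of a generated image and a style/inspiration image.

```python
# Gram-matrix style representation used by neural style transfer. The random
# arrays below stand in for CNN feature maps of a generated image and a
# style/inspiration image.
import numpy as np

def gram_matrix(features):
    """(H, W, C) activation map -> (C, C) matrix of channel correlations."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    return flat.T @ flat / (h * w)

def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    return np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2)

gen_feats = np.random.rand(32, 32, 64)     # placeholder feature map
style_feats = np.random.rand(32, 32, 64)   # placeholder feature map
print(style_loss(gen_feats, style_feats))
```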
  2. Collect
  3. Describe
  4. Discover
  5. Predict
  6. Advise
  7. Data Science Hierarchy of Needs
  8. What's the Problem
    1. Think about the real purpose
    2. Ideal condition
    3. Find out the gap
  9. Analysis of Current Situation
    1. Vague concept
    2. Quantification
    3. Details
    4. Does the approach make sense?
    5. Does the answer make sense?
    6. Does the analysis address the original intent?
    7. Is the story complete?
    8. Where next?
  10. Set Goals
    1. Discuss
    2. Sub Goals
    3. Quantifiable targets
  11. Find out the Reasons
    1. Why?
    2. Use tools to analyze
    3. Find out the reason from the phenomenon
  12. Solutions
    1. Who
      1. Person 1
      2. Person 2
    2. Where
    3. When
    4. How
  13. Implement
    1. Execute the plan
    2. Check the effect of implementation
    3. Stop solutions that aren't working