MoBai

FlavorGraph screening

Screening for molecularly compatible flavours

The molecular layer maps consumer-preferred flavours into an embedding space and ranks candidate pairings by how compatible they are with the flavours consumers already like. It does not re-discover consumer preferences; it amplifies them, narrowing thousands of molecular candidates to a small, human-testable shortlist.

Pipeline architecture

Consumer Voice: VADER, TextBlob, TF-IDF, K-Means. Input: consumer posts. Output: net sentiment per flavour, segments, pain points.
Molecular AI (FlavorGraph): metapath2vec on 8,298 nodes and 147,179 edges (Park et al., 2021). Output: a 300-dim embedding per node.
Pairing-Compatibility Recommender: cosine similarity to a consumer-liked anchor. Output: compatibility ranking and a human-testable shortlist.
Product Science: five-mechanism off-note masking. Input: the shortlisted variants. Output: the MoBai formulation.
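
The page does not say how the consumer-liked anchor is assembled from the Consumer Voice output. One plausible reading, shown here purely as an assumption, is a sentiment-weighted mean of the liked flavours' embedding vectors. The function name `build_anchor`, the clipping of negative sentiment to zero, and the toy two-dimensional vectors are all hypothetical.

```python
def build_anchor(flavour_vectors, net_sentiment):
    """Sentiment-weighted mean of flavour vectors (assumed scheme, not confirmed by the source).

    flavour_vectors: dict mapping flavour name -> embedding (list of floats)
    net_sentiment:   dict mapping flavour name -> net sentiment score
    """
    dim = len(next(iter(flavour_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for name, vec in flavour_vectors.items():
        # Assumption: flavours with non-positive sentiment do not contribute.
        weight = max(net_sentiment.get(name, 0.0), 0.0)
        if weight == 0.0:
            continue
        for i, value in enumerate(vec):
            total[i] += weight * value
        weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no positively rated flavours to build an anchor from")
    return [component / weight_sum for component in total]


# Toy usage with made-up 2-dim vectors (real FlavorGraph vectors are 300-dim).
anchor = build_anchor(
    {"mango": [1.0, 0.0], "jasmine": [0.0, 1.0]},
    {"mango": 0.75, "jasmine": 0.25},
)
```

Weighting by sentiment pulls the anchor toward the flavours consumers rate most favourably; an unweighted mean would be the simpler alternative if only the set of liked flavours matters.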

FlavorGraph knowledge graph

Graph nodes: 8,298 (6,653 ingredients + 1,645 compounds)
Edges: 147,179 (molecular co-occurrence relationships)
Embedding: 300-dim (8,297 nodes with vectors)
Screened for Variant C: 8,279 (candidate ingredient nodes)

For MoBai, thirteen target flavour nodes were mapped to the graph. Oolong and osmanthus are absent from the vocabulary, so the milk-tea profile in Variant B uses black tea and milk as a proxy, noted as a known limitation.
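
The proxy step can be sketched as a vocabulary lookup with a fallback table. This is an illustration only: `resolve_nodes`, the node spellings, and the contents of `vocabulary` and `proxies` are hypothetical and do not reflect the actual FlavorGraph node names.

```python
def resolve_nodes(flavour, vocabulary, proxies):
    """Return (nodes, used_proxy) for a target flavour.

    If the flavour is a graph node it maps to itself. Otherwise the
    stand-in nodes listed in `proxies` are used, and the caller is told
    so the limitation can be recorded.
    """
    if flavour in vocabulary:
        return [flavour], False
    stand_ins = [node for node in proxies.get(flavour, []) if node in vocabulary]
    if not stand_ins:
        raise KeyError(f"{flavour!r} has no node and no usable proxy")
    return stand_ins, True


# Hypothetical vocabulary and proxy table for illustration.
vocabulary = {"mango", "black_tea", "milk"}
proxies = {"oolong_milk_tea": ["black_tea", "milk"]}

nodes, used_proxy = resolve_nodes("oolong_milk_tea", vocabulary, proxies)
```

Returning the proxy flag alongside the nodes keeps the substitution visible downstream instead of silently treating a proxy score as a direct match.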

Explore the molecular graph

Click any node. Ingredients link to their representative aroma compounds; compounds that share chemical families link to each other.

Mango: ethyl, limonene, beta-myrcene, isoamyl
Jasmine: linalool, benzyl, indole, cis-jasmone
Coconut: delta-decalactone, gamma-nonalactone, caprylic, methylheptanone
Milk Tea: caffeine, theaflavin, pyrazine, geraniol, catechin
Yogurt Base: lactic, diacetyl, acetoin, acetaldehyde

Variant compatibility

Both primary variants sit in the HIGH affinity tier and are well separated from the strawberry baseline, confirming the screen discriminates as intended. These two variants are the first outputs of the screen, not its limit. The dashed line marks the mean compatibility of the thirteen consumer flavours.

HIGH: 0.70 and above. Strong molecular alignment with the consumer-preferred space.
MID-HIGH: 0.50 to 0.69. Moderate alignment, warrants sensory exploration.
MID: 0.35 to 0.49. Weak alignment.
LOW: below 0.35. Poor molecular fit with the consumer-liked anchor.
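
The tier boundaries above translate directly into a small lookup. This is a minimal sketch; the function name `affinity_tier` is hypothetical, and treating each band as half-open (so a score such as 0.695 falls in MID-HIGH) is an assumption, since the table only lists two-decimal ranges.

```python
def affinity_tier(score: float) -> str:
    """Map a compatibility score to the tier labels used in the table."""
    if score >= 0.70:
        return "HIGH"
    if score >= 0.50:
        return "MID-HIGH"
    if score >= 0.35:
        return "MID"
    return "LOW"


print(affinity_tier(0.73))  # HIGH
print(affinity_tier(0.61))  # MID-HIGH
```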

Live compatibility check

Pick ingredients and the score is computed in your browser as cosine similarity to the consumer-liked anchor, the same logic used to score the variants. This is a molecular compatibility ranking, not a liking prediction.

Select up to three ingredients. Their vectors are combined, and the molecular pairing-compatibility score is the cosine similarity of that combined vector to the consumer-liked anchor.
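
A minimal sketch of that score follows. The page does not state how the selected vectors are combined, so the element-wise mean used here is an assumption, and the names `cosine`, `combine` and `compatibility` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combine(vectors):
    """Element-wise mean of the selected vectors (assumed combination rule)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def compatibility(ingredient_vectors, anchor):
    """Score one to three ingredient vectors against the anchor."""
    if not 1 <= len(ingredient_vectors) <= 3:
        raise ValueError("select between one and three ingredients")
    return cosine(combine(ingredient_vectors), anchor)
```

Because cosine similarity ignores vector length, a mean and a plain sum of the selected vectors give the same score; the choice only matters if the combined vector is reused elsewhere.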

Example result: Fresh Mango + Jasmine Tea. Compatibility 0.73, HIGH affinity (scale marks at 0.35, 0.50 and 0.70).

Variant C discovery

Beyond confirming the two primary variants, the recommender screened the full ingredient vocabulary for a third candidate. Crushed pineapple ranked first at 0.61 in the MID-HIGH tier. It was not part of the anchor, so this is a genuine generalisation of the engine to a flavour it was never calibrated on.
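
A screen of this kind can be sketched as scoring every non-anchor node against the anchor and keeping the top of the list. The names `screen_candidates` and `anchor_members`, and the toy two-dimensional embeddings, are hypothetical; the values below are invented for illustration and are not FlavorGraph data.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_candidates(embeddings, anchor, anchor_members, top_k=5):
    """Rank nodes outside the anchor set by cosine similarity to the anchor."""
    scored = [
        (name, cosine(vector, anchor))
        for name, vector in embeddings.items()
        if name not in anchor_members  # keep the screen out-of-anchor
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: three made-up nodes, one of which belongs to the anchor.
toy_embeddings = {
    "mango": [1.0, 0.0],
    "pineapple": [0.8, 0.6],
    "garlic": [0.0, 1.0],
}
shortlist = screen_candidates(toy_embeddings, [1.0, 0.0], {"mango"}, top_k=2)
```

Excluding anchor members before ranking is what lets a top result count as a new candidate rather than a flavour the anchor was built from.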

Transparency. The in-sample correlation between compatibility and the consumer sentiment used to build the anchor is Pearson r = 0.6. Because the anchor is built from the same flavours it is correlated against, this is a descriptive consistency figure, not an out-of-sample predictive claim. Liking is validated separately by the consumer survey in the next stage.
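
For reference, the Pearson coefficient over paired compatibility and sentiment values can be computed as below. This is a generic sketch of the statistic, not the project's script; the function name `pearson` is hypothetical and no project data is reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)
```

With only thirteen flavours, and the anchor derived from those same flavours, a coefficient computed this way describes internal consistency only.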