How it connects
One engine, one continuous loop
The stages form a single loop, designed so that no component carries more inferential weight than it can support. Consumer voice identifies what people want and what fails them. The molecular layer ranks the pairings most compatible with those preferences. The survey closes the loop by testing whether the compatibility signal translates into real preference, and the GREEN gate confirms that it does for this panel and these concepts. The same loop can be re-run on new data and new flavour territories.
Data pipeline summary
| Component | Details |
|---|---|
| English tweets | 5,021 across 6 query groups |
| China posts | 1,649 across 3 platforms (Xiaohongshu, Weibo, Douyin) |
| FlavorGraph nodes | 8,298 nodes; 8,279 screened for Variant C |
| Survey respondents | n = 34 (APAC urban, aged 25 to 38) |
AI performance metrics
| Metric | Value | Notes |
|---|---|---|
| NLP inter-model agreement (English) | 0.635 | Mean confidence, VADER vs TextBlob |
| NLP inter-model agreement (Chinese) | 0.731 | Mean confidence, RoBERTa-JD vs SnowNLP |
| Clustering k selection | k = 5 | Silhouette optimisation over k in 2 to 7 |
| FlavorGraph HIGH-tier compatibility | 0.73 to 0.79 | Variant A and Variant B |
| FlavorGraph LOW-tier compatibility | 0.26 | Strawberry discriminative baseline |
| Survey validation (Spearman ρ) | 0.90 (p = 0.037) | Compatibility vs mean liking across concepts; ratings from n = 34 respondents |
| Survey tier separation | HIGH 6.9 / LOW 4.3 | Mean liking, 9-point hedonic scale |
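
The survey validation row rests on a rank correlation between each concept's compatibility score and its mean liking. The sketch below shows how such a coefficient can be computed by ranking both lists (ties share an average rank) and taking the Pearson correlation of the ranks. The function names and the five data points are hypothetical and chosen only for illustration; they are not the study's data, and this is not necessarily the exact procedure used in the pipeline.

```python
import math
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences with at least 3 items")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-concept values (illustrative only, not the survey results).
compatibility = [0.80, 0.70, 0.60, 0.40, 0.20]
mean_liking = [6.5, 7.0, 6.0, 5.0, 4.0]

rho = spearman_rho(compatibility, mean_liking)
print(f"Spearman rho = {rho:.2f}")
```

With five items and a single swapped pair of adjacent ranks, the coefficient comes out at 0.90. Under the usual t-approximation, ρ = 0.90 over five items gives a two-sided p of about 0.037, which suggests the reported correlation was computed across concepts rather than across the 34 respondents. That reading is an inference from the numbers, not something the table states.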