Overall, there is no difference between preparing a dataset for AI and preparing one for the myriad other tech tools and business uses. By AI here, we mean Generative (Gen) AI: next-best-word tools based on probability and correlation. Traditional analytical tools need normalised data (third normal form, etc.), queried with Structured Query Language (SQL), with potential ambiguities removed at the design stage. Gen AI is useful for analysing text documents, which are, by definition, unstructured, but it still acts as a glorified "word cloud", linking items together by frequency and probability. The main hospitality systems (PMS, POS, Spa, RMS, CRS, CRM) were built on normalised databases precisely to prevent mismatches, duplication, and gaps. Even then, duplicate guest profiles remain common, e.g. when a frequent guest changes email or surname. The better PMSes use fuzzy matching to ease de-duplication, but this remains a governance issue.
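
To make the fuzzy-matching point concrete, here is a minimal sketch in Python using only the standard library's `difflib`. It is not how any particular PMS works: the profile fields, the sample guests, and the 0.85 threshold are all assumptions chosen for illustration. The function only proposes candidate pairs; deciding whether to merge them is still a human, governance-level decision.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical guest profiles; the field names are illustrative, not a real PMS schema.
PROFILES = [
    {"id": 1, "name": "Maria Gonzalez", "email": "maria.gonzalez@example.com", "phone": "+34 600 111 222"},
    {"id": 2, "name": "Maria Gonzales", "email": "m.gonzalez@example.org", "phone": "+34600111222"},
    {"id": 3, "name": "John Smith", "email": "john.smith@example.com", "phone": "+44 20 7946 0000"},
]


def normalise(text):
    """Lower-case and collapse whitespace so formatting differences do not affect the score."""
    return " ".join(text.lower().split())


def digits(phone):
    """Keep only the digits of a phone number so spacing and punctuation are ignored."""
    return "".join(ch for ch in phone if ch.isdigit())


def name_similarity(a, b):
    """Similarity between two names, from 0.0 (nothing in common) to 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def candidate_duplicates(profiles, threshold=0.85):
    """Return (id_a, id_b, name_score, contact_match) for pairs worth a human review.

    A pair is flagged when the names are similar enough OR a contact detail
    (email or phone) matches, which covers the guest who changes surname
    but keeps the same phone number. The threshold is an assumed value.
    """
    flagged = []
    for a, b in combinations(profiles, 2):
        score = name_similarity(a["name"], b["name"])
        contact_match = (
            normalise(a["email"]) == normalise(b["email"])
            or digits(a["phone"]) == digits(b["phone"])
        )
        if score >= threshold or contact_match:
            flagged.append((a["id"], b["id"], round(score, 2), contact_match))
    return flagged


if __name__ == "__main__":
    for pair in candidate_duplicates(PROFILES):
        print(pair)
```

Comparing every pair is fine for a small sample like this, but it grows quadratically with the number of profiles, so a real guest database would need some way of narrowing down which profiles are compared with which.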