🚀 T-SQL to dbt Migration Agent Demo

📊 Step 1: Analyzing T-SQL procedure...
{
  "error": "Failed to parse analysis",
  "raw": "```json\n{\n  \"procedure_name\": \"sp_customer_sales_analysis\",\n  \"components\": {\n    \"ctes\": [\n      \"CustomerOrders\",\n      \"CustomerSegments\"\n    ],\n    \"temp_tables\": [\n      \"#CustomerMetrics\"\n    ],\n    \"cursors\": [\n      {\n        \"name\": \"customer_cursor\",\n        \"query\": \"SELECT customer_id FROM #CustomerMetrics WHERE total_sales > 1000\"\n      }\n    ],\n    \"updates\": [\n      \"UPDATE #CustomerMetrics SET total_sales = total_sales * 1.1 WHERE customer_id = @customer_id\"\n    ],\n    \"dynamic_sql\": false\n  },\n  \"business_logic\": \"Aggregates completed orders per customer (total sales, order count, last order date), classifies customers into segments (though the segment CTE is unused), stores the aggregates in a temp table, then boosts total sales by 10% for customers whose sales exceed $1,000 using a cursor-based loop, and finally returns the enriched metrics.\",\n  \"data_flow\": [\n    \"1. CTE `CustomerOrders` aggregates orders by customer_id where order_status = 'completed'.\",\n    \"2. CTE `CustomerSegments` derives a segment label based on total_sales but is not referenced later.\",\n    \"3. Insert aggregated rows from `CustomerOrders` into temp table `#CustomerMetrics`.\",\n    \"4. Open cursor `customer_cursor` to iterate over rows in `#CustomerMetrics` where total_sales > 1000.\",\n    \"5. For each cursor row, update that row\u2019s total_sales by multiplying by 1.1.\",\n    \"6. Close and deallocate the cursor.\",\n    \"7. Return all rows from `#CustomerMetrics`.\"\n  ],\n  \"complexity\": \"moderate\",\n  \"migration_challenges\": [\n    \"Replacing the cursor with a set\u2011based UPDATE (e.g., UPDATE #CustomerMetrics SET total_sales = total_sales * 1.1 WHERE total_sales > 1000).\",\n    \"Eliminating the unused `CustomerSegments` CTE or integrating its logic into the final model.\",\n    \"Translating the temp table into a dbt staging or intermediate model, ensuring the schema matches the original temp table.\",\n    \"Ensuring the final SELECT is represented as a dbt model that can be materialized (view or table).\",\n    \"Handling the business rule that boosts sales only for customers above a threshold without cursor overhead.\",\n    \"Testing that the segment classification logic (if needed later) is preserved in the dbt transformation.\"\n  ]\n}\n```"
}

🔄 Step 2: Converting to dbt models...
{
  "error": "Failed to generate dbt models",
  "raw": "```json\n{\n  \"models\": [\n    {\n      \"name\": \"intermediate_customer_orders\",\n      \"type\": \"intermediate\",\n      \"materialization\": \"view\",\n      \"sql\": \"SELECT\\n  customer_id,\\n  SUM(order_amount) AS total_sales,\\n  COUNT(*) AS order_count,\\n  MAX(order_date) AS last_order_date\\nFROM {{ ref('orders') }}\\nWHERE order_status = 'completed'\\nGROUP BY customer_id\",\n      \"tests\": [\n        \"unique(customer_id)\",\n        \"not_null(customer_id)\",\n        \"not_null(total_sales)\",\n        \"not_null(order_count)\",\n        \"not_null(last_order_date)\",\n        \"accepted_values(order_status, ['completed'])\"\n      ],\n      \"documentation\": \"Aggregates completed orders per customer: total sales, order count, and last order date.\"\n    },\n    {\n      \"name\": \"mart_customer_metrics\",\n      \"type\": \"mart\",\n      \"materialization\": \"view\",\n      \"sql\": \"WITH base AS (\\n  SELECT\\n    customer_id,\\n    total_sales,\\n    order_count,\\n    last_order_date\\n  FROM {{ ref('intermediate_customer_orders') }}\\n)\\nSELECT\\n  customer_id,\\n  CASE WHEN total_sales > 1000 THEN total_sales * 1.1 ELSE total_sales END AS total_sales,\\n  order_count,\\n  last_order_date\\nFROM base\",\n      \"tests\": [\n        \"unique(customer_id)\",\n        \"not_null(customer_id)\",\n        \"not_null(total_sales)\",\n        \"not_null(order_count)\",\n        \"not_null(last_order_date)\",\n        \"accepted_range(total_sales, 0, null)\"\n      ],\n      \"documentation\": \"Final customer metrics with a 10% boost applied to total sales for customers exceeding $1,000. Mirrors the cursor\u2011based update logic from the original T\u2011SQL procedure.\"\n    }\n  ],\n  \"conversion_notes\": [\n    \"Replaced the cursor and temp table with a single set\u2011based UPDATE logic inside the final view.\",\n    \"The unused CustomerSegments CTE was omitted; its logic can be added later if needed.\",\n    \"All staging and intermediate logic is expressed as declarative SQL views, enabling dbt\u2019s dependency graph.\",\n    \"The final mart is materialized as a view; change to `table` or `incremental` if persistence or performance is required.\",\n    \"Assumes a source model named `orders` exists; adjust the `ref('orders')` call if the source name differs.\"\n  ],\n  \"duckdb_optimizations\": [\n    \"Store the final mart as a Parquet table for efficient columnar reads: `{{ config(materialized='table', file_format='parquet') }}` if persistence is desired.\",\n    \"Enable DuckDB's `SET memory_limit='4GB';` to allow larger in\u2011memory operations during compilation.\",\n    \"Use `SET enable_bloom_filter = true;` to speed up joins on `customer_id` if the dataset grows large.\",\n    \"Leverage DuckDB's `SET enable_vectorized_execution = true;` for faster aggregation.\"\n  ],\n  \"validation_queries\": [\n    \"/* Validate row counts match original procedure */\\nSELECT COUNT(*) AS proc_rows FROM {{ ref('mart_customer_metrics') }};\\n\\n/* Validate total sales sum */\\nSELECT SUM(total_sales) AS proc_total_sales FROM {{ ref('mart_customer_metrics') }};\\n\\n/* Compare with a direct T\u2011SQL execution result set (run separately) */\\nSELECT\\n  COUNT(*) AS expected_rows,\\n  SUM(total_sales) AS expected_total_sales\\nFROM (\\n  /* Replace with the original procedure's output table or temp table */\\n  SELECT * FROM #CustomerMetrics\\n) AS expected;\",\n    \"/* Spot\u2011check a few customers */\\nSELECT\\n  customer_id,\\n  total_sales\\nFROM {{ ref('mart_customer_metrics') }}\\nWHERE customer_id IN (1, 2, 3);\"\n  ]\n}\n```"
}

✅ Step 3: Validating conversion...
{
  "error": "Failed to validate",
  "raw": "```json\n{\n  \"logic_preserved\": true,\n  \"confidence_score\": 0.92,\n  \"matched_transformations\": [\n    \"Aggregation of completed orders per customer (total_sales, order_count, last_order_date)\",\n    \"10% boost to total_sales for customers whose original total_sales > 1,000\",\n    \"Set\u2011based implementation replacing the cursor\",\n    \"Use of a view (or table) instead of a temp table to expose the final metrics\"\n  ],\n  \"discrepancies\": [\n    \"Column name mismatch: original uses `order_total`, dbt model references `order_amount`\",\n    \"The unused `CustomerSegments` CTE is omitted in the dbt conversion (not a functional loss but a difference in the model graph)\",\n    \"Materialization difference: original uses a temp table that is dropped automatically; dbt materializes as a view by default, which may affect persistence and performance\",\n    \"Potential precision/scale differences \u2013 original defines `total_sales` as DECIMAL(10,2); dbt does not explicitly set precision, relying on the source column type\",\n    \"No explicit handling of NULLs or zero totals \u2013 both implementations assume the source data is clean\"\n  ],\n  \"missing_logic\": [\n    \"Segment classification logic (VIP/Premium/Standard) is present in the original CTE but not used in the final output; if it is required downstream, it should be added to the dbt mart\",\n    \"Any side\u2011effects of the cursor (e.g., row\u2011by\u2011row locking) are not reproduced, but they are not part of the business logic\"\n  ],\n  \"recommendations\": [\n    \"Align column names: either rename `order_amount` in the source model to `order_total` or alias it in the intermediate model to avoid confusion.\",\n    \"Add an explicit `DECIMAL(10,2)` cast for `total_sales` in the mart to guarantee the same precision as the original procedure.\",\n    \"If persistence or incremental loading is desired, change the mart materialization to `table` or `incremental` and add appropriate `unique_key` and `incremental_strategy` configs.\",\n    \"Include the `CustomerSegments` logic in a separate model if segment information is needed downstream.\",\n    \"Add a test to verify that customers with `total_sales` exactly 1,000 are not boosted, and those just above 1,000 are.\",\n    \"Consider adding a `not_null` test on `order_total`/`order_amount` in the source model to guard against null aggregation results.\"\n  ],\n  \"test_scenarios\": [\n    \"Verify that the row count of the dbt mart matches the row count returned by the original procedure.\",\n    \"Compare the sum of `total_sales` from the dbt mart with the sum returned by the procedure.\",\n    \"Spot\u2011check a set of customers (e.g., IDs 1, 2, 3) to ensure the boost logic is applied correctly.\",\n    \"Test edge cases: a customer with `total_sales` exactly 1,000 (should not be boosted) and a customer with 1,000.01 (should be boosted).\",\n    \"Run the procedure and the dbt model on a dataset with no completed orders to confirm both return zero rows.\",\n    \"Measure performance on a large dataset to confirm that the set\u2011based view is faster than the cursor implementation.\"\n  ]\n}\n```"
}

🎯 Migration Complete!
Confidence Score: 0.0%
