{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Arrow example\n",
    "\n",
    "This example illustrates how to convert Arrow data that contains power-grid-model data to NumPy structured arrays, which the power-grid-model requests.\n",
    "\n",
    "It is by no means intended to provide complete documentation on the topic, but only to show how such conversions could be done.\n",
    "\n",
    "This example uses `pyarrow.RecordBatch` to demonstrate zero-copy operations. The user can choose a `pyarrow.Table` or other structures based on the requirement.\n",
    "\n",
    "**NOTE:** To run this example, the optional `examples` dependencies are required:\n",
    "\n",
    "```sh\n",
    "pip install .[examples]\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture cap --no-stderr\n",
    "from collections.abc import Iterable\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import pyarrow as pa\n",
    "from IPython.display import display\n",
    "from power_grid_model import (\n",
    "    AttributeType,\n",
    "    ComponentAttributeFilterOptions,\n",
    "    ComponentType,\n",
    "    DatasetType,\n",
    "    PowerGridModel,\n",
    "    power_grid_meta_data,\n",
    ")\n",
    "from power_grid_model.data_types import SingleColumnarData"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## Model\n",
    "\n",
    "For clarity, a simple network is created. More complex cases work similarly and can be found in the other examples:\n",
    "\n",
    "```\n",
    "node 1 ---- line 4 ---- node 2 ----line 5 ---- node 3\n",
    "   |                       |                      |\n",
    "source 6               sym_load 7             sym_load 8\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Single symmetric calculations\n",
    "\n",
    "Construct the input data for the model and construct the actual model.\n",
    "\n",
    "Arrow uses a columnar data format while the power-grid-model offers support for both row based and columnar data format.\n",
    "Because of this, the columnar data format of power-grid-model provides a zero-copy interface for Arrow data. This differs from the row-based data format, for which conversions always require a copy."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### List the power-grid-model data types\n",
    "\n",
    "See which attributes exist for a given component and which data types are used"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "node: {'names': [id, u_rated], 'formats': ['<i4', '<f8'], 'offsets': [0, 8], 'itemsize': 16, 'aligned': True}\n",
      "line: {'names': [id, from_node, to_node, from_status, to_status, r1, x1, c1, tan1, r0, x0, c0, tan0, i_n], 'formats': ['<i4', '<i4', '<i4', 'i1', 'i1', '<f8', '<f8', '<f8', '<f8', '<f8', '<f8', '<f8', '<f8', '<f8'], 'offsets': [0, 4, 8, 12, 13, 16, 24, 32, 40, 48, 56, 64, 72, 80], 'itemsize': 88, 'aligned': True}\n",
      "source: {'names': [id, node, status, u_ref, u_ref_angle, sk, rx_ratio, z01_ratio], 'formats': ['<i4', '<i4', 'i1', '<f8', '<f8', '<f8', '<f8', '<f8'], 'offsets': [0, 4, 8, 16, 24, 32, 40, 48], 'itemsize': 56, 'aligned': True}\n",
      "asym_load: {'names': [id, node, status, type, p_specified, q_specified], 'formats': ['<i4', '<i4', 'i1', 'i1', ('<f8', (3,)), ('<f8', (3,))], 'offsets': [0, 4, 8, 9, 16, 40], 'itemsize': 64, 'aligned': True}\n"
     ]
    }
   ],
   "source": [
    "node_input_dtype = power_grid_meta_data[DatasetType.input][ComponentType.node].dtype\n",
    "line_input_dtype = power_grid_meta_data[DatasetType.input][ComponentType.line].dtype\n",
    "source_input_dtype = power_grid_meta_data[DatasetType.input][ComponentType.source].dtype\n",
    "asym_load_input_dtype = power_grid_meta_data[DatasetType.input][ComponentType.asym_load].dtype\n",
    "print(\"node:\", node_input_dtype)\n",
    "print(\"line:\", line_input_dtype)\n",
    "print(\"source:\", source_input_dtype)\n",
    "print(\"asym_load:\", asym_load_input_dtype)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The primitive types of each attribute in the arrow tables need to match to make the operation efficient.\n",
    "Zero-copy conversion is not guaranteed if the data types provided via the PGM via `power_grid_meta_data` are not used.\n",
    "Note that the asymmetric type of attribute in power-grid-model has a shape of `(3,)` along with a specific type. These represent the 3 phases of electrical system.\n",
    "Hence, special care is required when handling asymmetric attributes. \n",
    "\n",
    "In this example, we use the respective primitive types for the symmetrical attributes and a `FixedSizeListArray` of the primitive types with length 3 for asymmetrical attributes. This results in them being stored as contiguous memory which would enable zero-copy conversion. Other possible workarounds to this are possible, but are beyond the scope of this example."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Creating a Schema\n",
    "\n",
    "We can make the task of assigning types easier by creating a schema based on the `DatasetType` and `ComponentType` directly from `power_grid_meta_data`. \n",
    "They can then directly be used to construct the `RecordBatch`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-------node schema-------\n",
      "id: int32\n",
      "u_rated: double\n",
      "-------asym load schema-------\n",
      "id: int32\n",
      "node: int32\n",
      "status: int8\n",
      "type: int8\n",
      "p_specified: fixed_size_list<item: double>[3]\n",
      "  child 0, item: double\n",
      "q_specified: fixed_size_list<item: double>[3]\n",
      "  child 0, item: double\n"
     ]
    }
   ],
   "source": [
    "def pgm_schema(\n",
    "    dataset_type: DatasetType, component_type: ComponentType, attributes: Iterable[AttributeType] | None = None\n",
    "):\n",
    "    schemas = []\n",
    "    component_dtype = power_grid_meta_data[dataset_type][component_type].dtype\n",
    "    for meta_attribute, (dtype, _) in component_dtype.fields.items():\n",
    "        if attributes is not None and meta_attribute not in attributes:\n",
    "            continue\n",
    "        # The asymmetric attributes are stored as a fixed list array of 3 elements, when dtype.shape is (3,).\n",
    "        pa_dtype = pa.list_(pa.from_numpy_dtype(dtype.base), 3) if dtype.shape == (3,) else pa.from_numpy_dtype(dtype)\n",
    "        schemas.append((meta_attribute, pa_dtype))\n",
    "    return pa.schema(schemas)\n",
    "\n",
    "\n",
    "print(\"-------node schema-------\")\n",
    "print(pgm_schema(DatasetType.input, ComponentType.node))\n",
    "print(\"-------asym load schema-------\")\n",
    "print(pgm_schema(DatasetType.input, ComponentType.asym_load))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create the grid using Arrow tables\n",
    "\n",
    "The [power-grid-model documentation on Components](https://power-grid-model.readthedocs.io/en/stable/user_manual/components.html) provides documentation on which components are required and which ones are optional.\n",
    "\n",
    "Construct the Arrow data as a table with the correct headers and data types. \n",
    "The creation of arrays and combining it in a RecordBatch as well as the method of initializing that RecordBatch is up to the user."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pyarrow.RecordBatch\n",
       "id: int32\n",
       "u_rated: double\n",
       "----\n",
       "id: [1,2,3]\n",
       "u_rated: [10500,10500,10500]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nodes_schema = pgm_schema(DatasetType.input, ComponentType.node)\n",
    "nodes = pa.record_batch(\n",
    "    [\n",
    "        pa.array([1, 2, 3], type=nodes_schema.field(AttributeType.id).type),\n",
    "        pa.array([10500.0, 10500.0, 10500.0], type=nodes_schema.field(AttributeType.u_rated).type),\n",
    "    ],\n",
    "    names=(\"id\", \"u_rated\"),\n",
    ")\n",
    "\n",
    "lines = pa.record_batch(\n",
    "    {\n",
    "        AttributeType.id: [4, 5],\n",
    "        AttributeType.from_node: [1, 2],\n",
    "        AttributeType.to_node: [2, 3],\n",
    "        AttributeType.from_status: [1, 1],\n",
    "        AttributeType.to_status: [1, 1],\n",
    "        AttributeType.r1: [0.11, 0.15],\n",
    "        AttributeType.x1: [0.12, 0.16],\n",
    "        AttributeType.c1: [4.1e-05, 5.4e-05],\n",
    "        AttributeType.tan1: [0.1, 0.1],\n",
    "        AttributeType.r0: [0.01, 0.05],\n",
    "        AttributeType.x0: [0.22, 0.06],\n",
    "        AttributeType.c0: [4.1e-05, 5.4e-05],\n",
    "        AttributeType.tan0: [0.4, 0.1],\n",
    "    },\n",
    "    schema=pgm_schema(\n",
    "        DatasetType.input,\n",
    "        ComponentType.line,\n",
    "        [\n",
    "            AttributeType.id,\n",
    "            AttributeType.from_node,\n",
    "            AttributeType.to_node,\n",
    "            AttributeType.from_status,\n",
    "            AttributeType.to_status,\n",
    "            AttributeType.r1,\n",
    "            AttributeType.x1,\n",
    "            AttributeType.c1,\n",
    "            AttributeType.tan1,\n",
    "            AttributeType.r0,\n",
    "            AttributeType.x0,\n",
    "            AttributeType.c0,\n",
    "            AttributeType.tan0,\n",
    "        ],\n",
    "    ),\n",
    ")\n",
    "\n",
    "sources = pa.record_batch(\n",
    "    {AttributeType.id: [6], AttributeType.node: [1], AttributeType.status: [1], AttributeType.u_ref: [1.0]},\n",
    "    schema=pgm_schema(\n",
    "        DatasetType.input,\n",
    "        ComponentType.source,\n",
    "        [AttributeType.id, AttributeType.node, AttributeType.status, AttributeType.u_ref],\n",
    "    ),\n",
    ")\n",
    "sym_loads = pa.record_batch(\n",
    "    {\n",
    "        AttributeType.id: [7, 8],\n",
    "        AttributeType.node: [2, 3],\n",
    "        AttributeType.status: [1, 1],\n",
    "        AttributeType.type: [0, 0],\n",
    "        AttributeType.p_specified: [1.0, 2.0],\n",
    "        AttributeType.q_specified: [0.5, 1.5],\n",
    "    },\n",
    "    schema=pgm_schema(\n",
    "        DatasetType.input,\n",
    "        ComponentType.sym_load,\n",
    "        [\n",
    "            AttributeType.id,\n",
    "            AttributeType.node,\n",
    "            AttributeType.status,\n",
    "            AttributeType.type,\n",
    "            AttributeType.p_specified,\n",
    "            AttributeType.q_specified,\n",
    "        ],\n",
    "    ),\n",
    ")\n",
    "\n",
    "nodes\n",
    "# the record batches of the other components can be printed similarly"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convert the Arrow data to power-grid-model input data\n",
    "\n",
    "The Arrow `RecordBatch` or `Table` can then be converted to row based data or columnar data.\n",
    "Converting Arrow data to columnar NumPy arrays is recommended to leverage the columnar nature of Arrow data. \n",
    "This conversion can be done with zero-copy operations.\n",
    "\n",
    "A similar approach can be adopted by the user to convert to row based data instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'id': array([1, 2, 3], dtype=int32),\n",
       " 'u_rated': array([10500., 10500., 10500.])}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def arrow_to_numpy(data: pa.RecordBatch, dataset_type: DatasetType, component_type: ComponentType) -> np.ndarray:\n",
    "    \"\"\"Convert Arrow data to NumPy data.\"\"\"\n",
    "    result = {}\n",
    "    result_dtype = power_grid_meta_data[dataset_type][component_type].dtype\n",
    "    for name, column in zip(data.column_names, data.columns):\n",
    "        # The use of zero_copy_only=True and assert statement is to verify if no copies are made.\n",
    "        # They are not mandatory for a zero-copy conversion.\n",
    "        column_data = column.to_numpy(zero_copy_only=True)\n",
    "        if column_data.dtype != result_dtype[name]:\n",
    "            raise TypeError(\n",
    "                f\"Data type mismatch for column '{name}': expected {result_dtype[name]}, got {column_data.dtype}\"\n",
    "            )\n",
    "        result[name] = column_data.astype(dtype=result_dtype[name], copy=False)\n",
    "    return result\n",
    "\n",
    "\n",
    "node_input = arrow_to_numpy(nodes, DatasetType.input, ComponentType.node)\n",
    "line_input = arrow_to_numpy(lines, DatasetType.input, ComponentType.line)\n",
    "source_input = arrow_to_numpy(sources, DatasetType.input, ComponentType.source)\n",
    "sym_load_input = arrow_to_numpy(sym_loads, DatasetType.input, ComponentType.sym_load)\n",
    "\n",
    "node_input"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Construct the complete input data structure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_data = {\n",
    "    ComponentType.node: node_input,\n",
    "    ComponentType.line: line_input,\n",
    "    ComponentType.source: source_input,\n",
    "    ComponentType.sym_load: sym_load_input,\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: validate the input data\n",
    "from power_grid_model.validation import validate_input_data\n",
    "\n",
    "validate_input_data(input_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Use the power-grid-model\n",
    "\n",
    "The `output_component_types` argument is set to `ComponentAttributeFilterOptions.relevant` to given out columnar data.\n",
    "\n",
    "For more extensive examples, visit the [power-grid-model documentation](https://power-grid-model.readthedocs.io/en/stable/index.html).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>energized</th>\n",
       "      <th>u_pu</th>\n",
       "      <th>u</th>\n",
       "      <th>u_angle</th>\n",
       "      <th>p</th>\n",
       "      <th>q</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1.000325</td>\n",
       "      <td>10503.410670</td>\n",
       "      <td>-0.000067</td>\n",
       "      <td>338777.246279</td>\n",
       "      <td>-3.299419e+06</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1.002879</td>\n",
       "      <td>10530.228073</td>\n",
       "      <td>-0.002932</td>\n",
       "      <td>-1.000000</td>\n",
       "      <td>-5.000001e-01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>1.004113</td>\n",
       "      <td>10543.184974</td>\n",
       "      <td>-0.004342</td>\n",
       "      <td>-2.000000</td>\n",
       "      <td>-1.500000e+00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   id  energized      u_pu             u   u_angle              p  \\\n",
       "0   1          1  1.000325  10503.410670 -0.000067  338777.246279   \n",
       "1   2          1  1.002879  10530.228073 -0.002932      -1.000000   \n",
       "2   3          1  1.004113  10543.184974 -0.004342      -2.000000   \n",
       "\n",
       "              q  \n",
       "0 -3.299419e+06  \n",
       "1 -5.000001e-01  \n",
       "2 -1.500000e+00  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# construct the model\n",
    "model = PowerGridModel(input_data=input_data, system_frequency=50)\n",
    "\n",
    "# run the calculation\n",
    "sym_result = model.calculate_power_flow(output_component_types=ComponentAttributeFilterOptions.relevant)\n",
    "\n",
    "# use pandas to tabulate and display the results\n",
    "sym_node_result = sym_result[ComponentType.node]\n",
    "display(pd.DataFrame(sym_node_result))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Convert the symmetric result to Arrow format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Converting symmetrical results is straightforward by using schema from [Creating Schema](#creating-a-schema)\n",
    "Using types other than the ones from this schema might make a copy of the data. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pyarrow.RecordBatch\n",
       "id: int32\n",
       "energized: int8\n",
       "u_pu: double\n",
       "u: double\n",
       "u_angle: double\n",
       "p: double\n",
       "q: double\n",
       "----\n",
       "id: [1,2,3]\n",
       "energized: [1,1,1]\n",
       "u_pu: [1.000324825742982,1.0028788641128945,1.004112854674026]\n",
       "u: [10503.410670301311,10530.228073185392,10543.184974077272]\n",
       "u_angle: [-0.00006651843181518038,-0.0029317915196012487,-0.004341587216862092]\n",
       "p: [338777.2462788448,-1.0000002693184182,-1.9999998867105226]\n",
       "q: [-3299418.661306348,-0.5000000701801947,-1.4999998507078594]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pa_sym_node_result = pa.record_batch(\n",
    "    sym_node_result, schema=pgm_schema(DatasetType.sym_output, ComponentType.node, sym_node_result.keys())\n",
    ")\n",
    "pa_sym_node_result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Single asymmetric calculations\n",
    "\n",
    "Asymmetric calculations have a tuple of values for some of the attributes and are not easily convertible to record batches.\n",
    "Instead, one can have a look at the individual components of those attributes and/or flatten the arrays to access all components."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Asymmetric input\n",
    "\n",
    "To illustrate the conversion, let's consider a similar grid but with asymmetric loads.\n",
    "\n",
    "```\n",
    "node 1 ---- line 4 ---- node 2 ----line 5 ---- node 3\n",
    "   |                       |                      |\n",
    "source 6              asym_load 7            asym_load 8\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pyarrow.RecordBatch\n",
       "id: int32\n",
       "node: int32\n",
       "status: int8\n",
       "type: int8\n",
       "p_specified: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "q_specified: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "----\n",
       "id: [7,8]\n",
       "node: [2,3]\n",
       "status: [1,1]\n",
       "type: [0,0]\n",
       "p_specified: [[1,0.01,0.011],[2,2.5,450]]\n",
       "q_specified: [[0.5,1500,0.1],[1.5,2.5,1500]]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "asym_loads_dict = {\n",
    "    AttributeType.id: [7, 8],\n",
    "    AttributeType.node: [2, 3],\n",
    "    AttributeType.status: [1, 1],\n",
    "    AttributeType.type: [0, 0],\n",
    "    AttributeType.p_specified: [[1.0, 1.0e-2, 1.1e-2], [2.0, 2.5, 4.5e2]],\n",
    "    AttributeType.q_specified: [[0.5, 1.5e3, 0.1], [1.5, 2.5, 1.5e3]],\n",
    "}\n",
    "\n",
    "asym_loads = pa.record_batch(\n",
    "    {\n",
    "        AttributeType.id: [7, 8],\n",
    "        AttributeType.node: [2, 3],\n",
    "        AttributeType.status: [1, 1],\n",
    "        AttributeType.type: [0, 0],\n",
    "        AttributeType.p_specified: [[1.0, 1.0e-2, 1.1e-2], [2.0, 2.5, 4.5e2]],\n",
    "        AttributeType.q_specified: [[0.5, 1.5e3, 0.1], [1.5, 2.5, 1.5e3]],\n",
    "    },\n",
    "    schema=pgm_schema(\n",
    "        DatasetType.input,\n",
    "        ComponentType.asym_load,\n",
    "        [\n",
    "            AttributeType.id,\n",
    "            AttributeType.node,\n",
    "            AttributeType.status,\n",
    "            AttributeType.type,\n",
    "            AttributeType.p_specified,\n",
    "            AttributeType.q_specified,\n",
    "        ],\n",
    "    ),\n",
    ")\n",
    "\n",
    "asym_loads"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{id: array([7, 8], dtype=int32),\n",
       " node: array([2, 3], dtype=int32),\n",
       " status: array([1, 1], dtype=int8),\n",
       " type: array([0, 0], dtype=int8),\n",
       " p_specified: array([[1.0e+00, 1.0e-02, 1.1e-02],\n",
       "        [2.0e+00, 2.5e+00, 4.5e+02]]),\n",
       " q_specified: array([[5.0e-01, 1.5e+03, 1.0e-01],\n",
       "        [1.5e+00, 2.5e+00, 1.5e+03]])}"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def arrow_to_numpy_asym(data: pa.RecordBatch, dataset_type: DatasetType, component_type: ComponentType) -> np.ndarray:\n",
    "    \"\"\"Convert asymmetric Arrow data to NumPy data.\n",
    "\n",
    "    This function is similar to the arrow_to_numpy function, but also supports asymmetric data.\"\"\"\n",
    "    result = {}\n",
    "    result_dtype = power_grid_meta_data[dataset_type][component_type].dtype\n",
    "\n",
    "    for name in result_dtype.names:\n",
    "        if name not in data.column_names:\n",
    "            continue\n",
    "        dtype = result_dtype[name]\n",
    "\n",
    "        if len(dtype.shape) == 0:\n",
    "            column_data = data.column(name).to_numpy(zero_copy_only=True)\n",
    "        else:\n",
    "            column_data = data.column(name).flatten().to_numpy(zero_copy_only=True).reshape(-1, 3)\n",
    "        if column_data.dtype != dtype.base:\n",
    "            raise TypeError(f\"Data type mismatch for column '{name}': expected {dtype.base}, got {column_data.dtype}\")\n",
    "        result[name] = column_data.astype(dtype=dtype.base, copy=False)\n",
    "    return result\n",
    "\n",
    "\n",
    "asym_load_input = arrow_to_numpy_asym(asym_loads, DatasetType.input, ComponentType.asym_load)\n",
    "\n",
    "asym_load_input"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Use the power-grid-model in asymmetric calculations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-0.000067</td>\n",
       "      <td>-2.094462</td>\n",
       "      <td>2.094328</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-0.002930</td>\n",
       "      <td>-2.097322</td>\n",
       "      <td>2.091464</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>-0.004338</td>\n",
       "      <td>-2.098733</td>\n",
       "      <td>2.090057</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          0         1         2\n",
       "0 -0.000067 -2.094462  2.094328\n",
       "1 -0.002930 -2.097322  2.091464\n",
       "2 -0.004338 -2.098733  2.090057"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "asym_input_data = {\n",
    "    ComponentType.node: node_input,\n",
    "    ComponentType.line: line_input,\n",
    "    ComponentType.source: source_input,\n",
    "    ComponentType.asym_load: asym_load_input,\n",
    "}\n",
    "\n",
    "validate_input_data(asym_input_data, symmetric=False)\n",
    "\n",
    "# construct the model\n",
    "asym_model = PowerGridModel(input_data=asym_input_data, system_frequency=50)\n",
    "\n",
    "# run the calculation\n",
    "asym_result = asym_model.calculate_power_flow(\n",
    "    symmetric=False, output_component_types=ComponentAttributeFilterOptions.relevant\n",
    ")\n",
    "\n",
    "# use pandas to display the results, but beware the data types\n",
    "display(pd.DataFrame(asym_result[ComponentType.node][AttributeType.u_angle]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convert asymmetric power-grid-model output data to Arrow output data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pyarrow.RecordBatch\n",
       "id: int32\n",
       "energized: int8\n",
       "u_pu: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "u: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "u_angle: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "p: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "q: fixed_size_list<item: double>[3]\n",
       "  child 0, item: double\n",
       "----\n",
       "id: [1,2,3]\n",
       "energized: [1,1,1]\n",
       "u_pu: [[1.0003248257977395,1.0003243769486854,1.00032436416241],[1.0028803762176162,1.0028710993140408,1.002873078902153],[1.0041143008174032,1.004103358307718,1.0041004935738544]]\n",
       "u: [[6064.146978239599,6064.144257236815,6064.1441797241405],[6079.639179329455,6079.582941090302,6079.594941705461],[6087.119449677845,6087.053114238265,6087.03574771216]]\n",
       "u_angle: [[-0.00006651848125694056,-2.094461573665813,2.09432849798745],[-0.002929883186483061,-2.0973219974462594,2.0914640024381836],[-0.0043376855072092685,-2.098732840554144,2.0900574062078014]]\n",
       "p: [[112925.89463800077,112918.1351708998,113364.09104539294],[-0.9999999600705957,-0.009999974119601074,-0.010999946473583844],[-1.9999999924219376,-2.500000047715548,-450.0000000248862]]\n",
       "q: [[-1099806.4185887817,-1098301.0302391544,-1098302.7942318914],[-0.5000000935833835,-1499.9999999519298,-0.09999992365339339],[-1.500000040797027,-2.499999950281157,-1500.0000000267694]]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ARRAY_2D = 2\n",
    "ASYM_DATA = 3\n",
    "\n",
    "\n",
    "def numpy_columnar_to_arrow(\n",
    "    data: SingleColumnarData, dataset_type: DatasetType, component_type: ComponentType\n",
    ") -> pa.RecordBatch:\n",
    "    \"\"\"Convert NumPy data to Arrow data.\"\"\"\n",
    "    component_pgm_schema = pgm_schema(dataset_type, component_type, data.keys())\n",
    "    pa_columns = {}\n",
    "    for attribute, column_data in data.items():\n",
    "        primitive_type = component_pgm_schema.field(attribute).type\n",
    "\n",
    "        if column_data.ndim == ARRAY_2D and column_data.shape[1] == ASYM_DATA:\n",
    "            pa_columns[attribute] = pa.FixedSizeListArray.from_arrays(column_data.flatten(), type=primitive_type)\n",
    "        else:\n",
    "            pa_columns[attribute] = pa.array(column_data, type=primitive_type)\n",
    "    return pa.record_batch(pa_columns, component_pgm_schema)\n",
    "\n",
    "\n",
    "pa_asym_node_result = numpy_columnar_to_arrow(\n",
    "    asym_result[ComponentType.node], DatasetType.asym_output, ComponentType.node\n",
    ")\n",
    "\n",
    "pa_asym_node_result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Batch data\n",
    "\n",
    "power-grid-model supports batch calculations by providing an `update_data` argument, as shown in [this example](https://power-grid-model.readthedocs.io/en/stable/examples/Power%20Flow%20Example.html#batch-calculation).\n",
    "\n",
    "Both the `update_data` and the output result are similar to the `input_data` and output data in the above, except that they have another dimension representing the batch index: the first index in the NumPy structured arrays.\n",
    "\n",
    "This extra index can be represented in Arrow using a [`RecordBatch`](https://arrow.apache.org/docs/cpp/api/table.html#two-dimensional-datasets) or using any other multi-index data format."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "power-grid-model-io",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.2"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}