<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one
  ~ or more contributor license agreements.  See the NOTICE file
  ~ distributed with this work for additional information
  ~ regarding copyright ownership.  The ASF licenses this file
  ~ to you under the Apache License, Version 2.0 (the
  ~ "License"); you may not use this file except in compliance
  ~ with the License.  You may obtain a copy of the License at
  ~
  ~   http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing,
  ~ software distributed under the License is distributed on an
  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  ~ KIND, either express or implied.  See the License for the
  ~ specific language governing permissions and limitations
  ~ under the License.
  -->

### Generate Delta Table for Unit Tests

To test Delta Lake ingestion, use the Python script `create_delta_table.py` to generate a sample Delta table.
First, create a conda environment named `delta_test` with the requirements listed in `requirements.txt`
installed:
```shell
conda create --name delta_test --file requirements.txt
```

To activate the environment:

```shell
conda activate delta_test
```

From the activated conda environment, run the Python script:

```shell
python3 create_delta_table.py
```

By default, the script uses `append` mode and writes 5 random records per run; the destination is set with
`--save_path` (for example, `resources/employee-delta-table`). You can override the defaults by supplying command-line arguments:

```shell
python3 create_delta_table.py -h

usage: create_delta_table.py [-h] [--delta_table_type {TableType.SIMPLE,TableType.COMPLEX,TableType.SNAPSHOTS}] --save_path SAVE_PATH [--save_mode {append,overwrite}] [--partitioned_by {date,name,id}] [--num_records NUM_RECORDS]

Script to write a Delta Lake table.

options:
  -h, --help            show this help message and exit
  --delta_table_type {TableType.SIMPLE,TableType.COMPLEX,TableType.SNAPSHOTS}
                        Choose a Delta table type to generate. (default: TableType.SIMPLE)
  --save_path SAVE_PATH
                        Save path for Delta table (default: None)
  --save_mode {append,overwrite}
                        Specify write mode (append/overwrite) (default: append)
  --partitioned_by {date,name,id}
                        Column to partition the Delta table (default: None)
  --num_records NUM_RECORDS
                        Specify number of Delta records to write (default: 5)
```
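The help text above can be reproduced with a small `argparse` setup. The following is only a sketch of how such a parser might look, not the actual contents of `create_delta_table.py`: the `TableType` enum values (`"simple"`, `"complex"`, `"snapshots"`) and the `build_parser` helper are assumptions, chosen so that `--delta_table_type=complex` parses while the help output still lists `TableType.COMPLEX`.

```python
import argparse
from enum import Enum


class TableType(Enum):
    # Assumed enum values; the real script may define these differently.
    SIMPLE = "simple"
    COMPLEX = "complex"
    SNAPSHOTS = "snapshots"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose options mirror the usage text shown above."""
    parser = argparse.ArgumentParser(
        prog="create_delta_table.py",
        description="Script to write a Delta Lake table.",
        # Appends "(default: ...)" to every option's help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "--delta_table_type",
        type=TableType,  # converts "complex" into TableType.COMPLEX
        choices=list(TableType),
        default=TableType.SIMPLE,
        help="Choose a Delta table type to generate.",
    )
    parser.add_argument(
        "--save_path",
        required=True,
        help="Save path for Delta table",
    )
    parser.add_argument(
        "--save_mode",
        choices=["append", "overwrite"],
        default="append",
        help="Specify write mode (append/overwrite)",
    )
    parser.add_argument(
        "--partitioned_by",
        choices=["date", "name", "id"],
        default=None,
        help="Column to partition the Delta table",
    )
    parser.add_argument(
        "--num_records",
        type=int,
        default=5,
        help="Specify number of Delta records to write",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this sketch, omitting `--num_records` yields 5 records per run, which is consistent with the record counts quoted for the tables below.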

### Non-partitioned table `employee-delta-table`:

The test data in `resources/employee-delta-table` contains 15 Delta records generated over 2 snapshots.
The table was generated by running the following commands:
```shell
python3 create_delta_table.py --save_path=employee-delta-table --num_records=10
python3 create_delta_table.py --save_path=employee-delta-table
```

The resulting Delta table is checked in to the repo. The expected rows to be used in tests are updated in
`NonPartitionedDeltaTable.java` accordingly.

### Partitioned table `employee-delta-table-partitioned-name`:

The test data in `resources/employee-delta-table-partitioned-name` contains 15 Delta records generated over 3 snapshots.
This table is partitioned by the name column. The table was generated by running the following commands:
```shell
python3 create_delta_table.py --save_path=employee-delta-table-partitioned-name --partitioned_by=name
python3 create_delta_table.py --save_path=employee-delta-table-partitioned-name --partitioned_by=name
python3 create_delta_table.py --save_path=employee-delta-table-partitioned-name --partitioned_by=name
```

The resulting Delta table is checked in to the repo. The expected rows to be used in tests are updated in
`PartitionedDeltaTable.java` accordingly.

### Complex types table `complex-types-table`:

The test data in `resources/complex-types-table` contains 5 Delta records generated in 1 snapshot.
The table was generated by running the following command:
```shell
python3 create_delta_table.py --save_path=complex-types-table --delta_table_type=complex
```

The resulting Delta table is checked in to the repo. The expected rows to be used in tests are updated in
`ComplexTypesDeltaTable.java` accordingly.

### Snapshots table `snapshot-table`:

The test data in `resources/snapshot-table` contains 4 Delta snapshots with delete, update and removal of records across
snapshots. The table was generated by running the following command:
```shell
python3 create_delta_table.py --save_path=snapshot-table --partitioned_by=id --delta_table_type=snapshots --num_records=3
```

The resulting Delta table is checked in to the repo. The expected rows to be used in tests are updated in
`SnapshotDeltaTable.java` accordingly.
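
After regenerating any of the tables above, it can be useful to confirm that the number of table versions matches the snapshot counts quoted in this document. The sketch below is one possible check using only the standard library. It relies on the Delta transaction log layout, where each commit is stored as a zero-padded, 20-digit `<version>.json` file under `_delta_log/`. The helper name `list_commit_versions` is hypothetical, and the check assumes that each snapshot corresponds to one commit file and that no log files have been cleaned up.

```python
import re
from pathlib import Path

# Commit files look like 00000000000000000000.json (20 digits).
# Checkpoints, .crc files, and _last_checkpoint do not match this pattern.
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def list_commit_versions(table_path):
    """Return the sorted commit versions found in <table_path>/_delta_log."""
    log_dir = Path(table_path) / "_delta_log"
    versions = []
    for entry in log_dir.iterdir():
        match = COMMIT_FILE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)
```

For example, `len(list_commit_versions("resources/employee-delta-table"))` would be expected to equal 2 if each of the two runs produced exactly one commit.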
