Apache Hudi Table Types: CoW vs MoR
Apache Hudi offers two main table types to meet different data lake requirements: Copy on Write (CoW) and Merge on Read (MoR). Choosing the right one is crucial because it determines the trade-off between write latency, read query performance, and data freshness, and therefore which business use cases the table serves well.
Copy On Write (CoW)
In the Copy On Write table type, data is stored exclusively in columnar file formats (e.g., Parquet). Each update is merged synchronously during the write: Hudi rewrites the entire Parquet file that contains the changed records, producing a new version of that file. This makes reads fast but writes slow, which is ideal for batch ingestion and analytics (OLAP) use cases.
Configuration for CoW Table:
(df.write.format("hudi")
    .option("hoodie.datasource.write.table.type", "COPY_ON_WRITE")
    .option("hoodie.table.name", tableName)
    .save(basePath))
Concepts
- Storage: Columnar files only (e.g., Parquet). Updates are merged synchronously at write time by rewriting the affected Parquet files in full.
- Writes: Higher write latency and cost, because files are rewritten.
- Reads: Fast, as queries read only clean, full Parquet files (no logs to merge).
- Best For: Read-heavy analytics (OLAP), daily batch pipelines, data marts.
- Compaction: No compaction needed.
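The two-option snippet above leaves out settings that upserts usually rely on. As a minimal sketch, assuming a PySpark job and hypothetical column names `id` and `updated_at`, the helper below collects the write options in one dictionary so the table type can be switched in a single place. The function name `hudi_write_options` and the table name `orders` are our own illustrations, not part of Hudi; the option keys are Hudi Spark datasource settings and should be checked against the Hudi version in use.

```python
from typing import Dict

VALID_TABLE_TYPES = {"COPY_ON_WRITE", "MERGE_ON_READ"}


def hudi_write_options(
    table_name: str,
    table_type: str,
    record_key: str,
    precombine_field: str,
) -> Dict[str, str]:
    """Build a dictionary of Hudi write options (illustrative helper, not a Hudi API)."""
    if table_type not in VALID_TABLE_TYPES:
        raise ValueError(f"table_type must be one of {sorted(VALID_TABLE_TYPES)}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": table_type,
        "hoodie.datasource.write.operation": "upsert",
        # Column that uniquely identifies a record (assumed name supplied by the caller).
        "hoodie.datasource.write.recordkey.field": record_key,
        # Column used to pick the latest version when keys collide (assumed name).
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


cow_options = hudi_write_options("orders", "COPY_ON_WRITE", "id", "updated_at")

# With a live SparkSession and a DataFrame `df` (not created here), the write
# would look like:
# df.write.format("hudi").options(**cow_options).mode("append").save(base_path)
```

Passing `"MERGE_ON_READ"` instead produces the equivalent options for a MoR table.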
Pros:
- Simplest operational model.
- Best read performance.
- No compaction needed.
Cons:
- Higher write latency.
- Higher write amplification.
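To make the write amplification concrete, here is a toy model in plain Python. It is a sketch of the copy-on-write idea only, not Hudi's implementation: a "file" is a list of records, and the function name `cow_upsert` and the `id`/`status` fields are invented for illustration.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def cow_upsert(
    base_file: List[Record], updates: List[Record], key: str = "id"
) -> Tuple[List[Record], int]:
    """Merge updates at write time and return (new file version, records written)."""
    merged = {rec[key]: rec for rec in base_file}
    for rec in updates:
        merged[rec[key]] = rec  # the updated record replaces the old one
    new_file = list(merged.values())
    # Every record is written again, even though only a few changed.
    return new_file, len(new_file)


base = [{"id": i, "status": "new"} for i in range(1000)]
updates = [{"id": 7, "status": "shipped"}]

new_base, records_written = cow_upsert(base, updates)
write_amplification = records_written / len(updates)
print(records_written, write_amplification)
```

In this model a single changed record causes the whole file to be written again, while a reader only ever sees one clean file and has nothing to merge.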
Merge On Read (MoR)
Merge On Read stores data using a combination of columnar (Parquet) and row-based (Avro) file formats. Updates are appended to delta (log) files and later merged into new versions of the columnar base files; this process is called compaction. Compaction runs after a number of commits or according to the configured compaction trigger. Once it finishes, queries no longer have to merge the accumulated log files at read time, so read performance improves.
Configuration for MoR Table:
(df.write.format("hudi")
    .option("hoodie.datasource.write.table.type", "MERGE_ON_READ")
    .option("hoodie.table.name", tableName)
    .save(basePath))
Concepts
- Storage: Base Parquet files + separate Avro (row-based) log files for changes.
- Writes: Lower write latency, because updates are appended to log files instead of rewriting base files. This lowers write amplification; compaction merges the logs into the base files later.
- Reads: Slower before compaction, as queries must merge Parquet base files with log files. After compaction, reads become faster.
- Best For: High-frequency updates and deletes, CDC, and workloads that need near real-time data.
- Compaction: Required, to merge log files into base files.
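When compaction runs is controlled through write options. The sketch below assumes inline compaction triggered after a fixed number of delta commits; the helper name `mor_compaction_options` is hypothetical, and the keys `hoodie.compact.inline` and `hoodie.compact.inline.max.delta.commits` should be verified against the configuration reference for your Hudi version.

```python
from typing import Dict


def mor_compaction_options(max_delta_commits: int = 5, inline: bool = True) -> Dict[str, str]:
    """Build MoR compaction-related write options (illustrative helper, not a Hudi API)."""
    if max_delta_commits < 1:
        raise ValueError("max_delta_commits must be at least 1")
    return {
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        # Run compaction as part of the write itself rather than as a separate job.
        "hoodie.compact.inline": "true" if inline else "false",
        # Number of delta commits to accumulate before compaction is triggered.
        "hoodie.compact.inline.max.delta.commits": str(max_delta_commits),
    }


compaction_options = mor_compaction_options(max_delta_commits=3)

# These options would be passed alongside the other write options, for example:
# df.write.format("hudi").options(**compaction_options)...
```

A smaller commit threshold keeps reads closer to base-file speed at the cost of more frequent rewrites; a larger one favors write throughput.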
Pros:
- Lower write latency.
- Lower write amplification.
- Near real-time data availability.
Cons:
- Higher read latency.
- Operational complexity (compaction management).
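The same trade-off can be seen in a toy model written in plain Python. This is a sketch of the merge-on-read idea under simplifying assumptions (one base file, one log, in-memory lists), not Hudi's implementation; the class name `MorFileGroup` and the `id`/`status` fields are invented for illustration.

```python
from typing import Dict, List

Record = Dict[str, object]


class MorFileGroup:
    """Toy model: a columnar base file plus an append-only log of changes."""

    def __init__(self, base_file: List[Record], key: str = "id") -> None:
        self.key = key
        self.base_file = list(base_file)
        self.log: List[Record] = []
        self.records_written = 0

    def upsert(self, updates: List[Record]) -> None:
        # Writes are cheap: only the changed records are appended to the log.
        self.log.extend(updates)
        self.records_written += len(updates)

    def snapshot_read(self) -> List[Record]:
        # Reads pay the cost: base records are merged with the log on the fly.
        merged = {rec[self.key]: rec for rec in self.base_file}
        for rec in self.log:
            merged[rec[self.key]] = rec
        return list(merged.values())

    def compact(self) -> None:
        # Compaction rewrites the base file once and clears the log.
        self.base_file = self.snapshot_read()
        self.log = []
        self.records_written += len(self.base_file)


group = MorFileGroup([{"id": i, "status": "new"} for i in range(1000)])
group.upsert([{"id": 7, "status": "shipped"}])
written_before_compaction = group.records_written
fresh_view = group.snapshot_read()
group.compact()
```

Before `compact()` is called, the update costs one written record but every read has to merge the log; afterwards the log is empty and reads touch only the base file.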
Comparison: CoW vs MoR
| Feature | Copy on Write (CoW) | Merge on Read (MoR) |
|---|---|---|
| Data Storage | Parquet files only | Parquet + Avro |
| Write Mechanism | Updates trigger the rewriting of entire base files. This happens synchronously during the write operation. | Updates are appended to delta log files. Compaction later merges them into new versions of the columnar base files, after a number of commits or as set by the compaction trigger configuration. |
| Read Mechanism | Queries read only the base files, requiring no dynamic merging. | Queries need to merge Parquet files with log files to provide up-to-date data. This can be slower before compaction. |
| Data Freshness | Lower (data arrives in less frequent batches) | Near real-time |
| Compaction | Not required | Required (async or sync); otherwise read performance degrades. |
| Use Case | Read-heavy workloads, Batch processing | Write-heavy workloads, Streaming |
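The comparison can be condensed into a rough rule of thumb. The function below is only a sketch of that heuristic: `suggest_table_type` and its two boolean parameters are hypothetical names, and real decisions usually also weigh query engines, file sizes, and operational capacity.

```python
def suggest_table_type(update_heavy: bool, needs_near_real_time: bool) -> str:
    """Return a Hudi table type suggestion based on the comparison table (heuristic only)."""
    if update_heavy or needs_near_real_time:
        # Frequent updates or fresh data favor cheap writes and deferred merging.
        return "MERGE_ON_READ"
    # Read-heavy batch workloads favor clean base files and no compaction.
    return "COPY_ON_WRITE"


batch_choice = suggest_table_type(update_heavy=False, needs_near_real_time=False)
cdc_choice = suggest_table_type(update_heavy=True, needs_near_real_time=True)
print(batch_choice, cdc_choice)
```

The returned string matches the values accepted by `hoodie.datasource.write.table.type`, so it can be passed straight into the write options.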
To get started with creating Apache Hudi tables, see our Getting Started with Apache Hudi guide.
Need help choosing the right architecture?
Contact Avocado Datalake for expert consultation on designing scalable data lakes.
Visit our product pages for more information and contact us for a free consultation.


