Friday, September 5, 2025

PGLOGICAL in AWS Postgres

 

Setting Up pglogical Replication on AWS Aurora PostgreSQL

Setting up pglogical replication on Aurora PostgreSQL involves configuring both the publisher (the source database) and the subscriber (the target database). Since Aurora is a managed service, many administrative tasks are handled for you, but you still need to configure the DB cluster parameter group, enable the pglogical extension, and set up the replication set (pglogical's equivalent of a publication) and the subscription.

Here’s a step-by-step guide to setting up pglogical replication in AWS Aurora PostgreSQL:


Step 1: Enable pglogical Extension on Aurora PostgreSQL

  1. Log in to the AWS Management Console and navigate to Amazon RDS.

  2. Modify the Aurora DB Cluster to enable the pglogical extension:

    • In the RDS Console, go to Databases > select your Aurora PostgreSQL cluster.

    • Click Modify.

    • In the DB Cluster Parameter Group section, either select an existing parameter group or create a new one.

      • New parameter group: Create one with a name like pglogical-params.

    • Add the following parameters to the parameter group:

      • shared_preload_libraries = pglogical: loads the pglogical extension at server start.

      • rds.logical_replication = 1: on RDS/Aurora this sets wal_level to logical, which is required for logical replication (wal_level cannot be set directly in the parameter group).

      • Other optional parameters: Based on your replication setup, you may also adjust max_replication_slots, max_wal_senders, etc. (see the recommended parameters later in this post).

  3. Apply the parameter group to the Aurora DB Cluster and reboot the cluster to apply the changes.

    Important: Changing parameters like shared_preload_libraries and wal_level will require a cluster reboot.
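
If you prefer to script this step, the same parameter group setup can be done with the AWS CLI. This is a minimal sketch, not the only way to do it: the group name, parameter group family (aurora-postgresql15 here), and cluster identifier are placeholders for your own values.

aws rds create-db-cluster-parameter-group \
    --db-cluster-parameter-group-name pglogical-params \
    --db-parameter-group-family aurora-postgresql15 \
    --description "Cluster parameters for pglogical replication"

aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name pglogical-params \
    --parameters "ParameterName=shared_preload_libraries,ParameterValue=pglogical,ApplyMethod=pending-reboot" \
                 "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"

aws rds modify-db-cluster \
    --db-cluster-identifier my-aurora-cluster \
    --db-cluster-parameter-group-name pglogical-params \
    --apply-immediately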


Step 2: Install pglogical Extension on the Publisher

  1. Connect to your Aurora PostgreSQL instance as a user with the rds_superuser role (for example, the master user) using a PostgreSQL client (e.g., psql or pgAdmin).

  2. Run the following command to create the pglogical extension on the publisher database:

CREATE EXTENSION pglogical;
  3. Verify that the extension is installed by running:

SELECT * FROM pg_extension WHERE extname = 'pglogical';

Step 3: Create a Provider Node and Replication Set on the Publisher

In pglogical, the publisher is registered as a provider node, and the set of tables to replicate is defined by a replication set (pglogical's equivalent of a publication). You can add specific tables or entire schemas to a replication set.

  1. Connect to the publisher database and register it as a pglogical node:

-- Register the publisher as a pglogical provider node
SELECT pglogical.create_node(
    node_name := 'provider1',
    dsn := 'host=publisher_host port=5432 dbname=publisher_db user=replication_user password=your_password'
);

  2. Add tables to a replication set. For example, to replicate all tables in a schema:

-- Add all tables in the public schema to the default replication set
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

Alternatively, if you want to replicate specific tables, you can add them individually:

-- Add specific tables to the default replication set
SELECT pglogical.replication_set_add_table('default', 'public.my_table1');
SELECT pglogical.replication_set_add_table('default', 'public.my_table2');

  3. To list the replication sets and the tables assigned to them:

SELECT * FROM pglogical.replication_set;
SELECT * FROM pglogical.tables;

Step 4: Enable pglogical Replication on the Subscriber

  1. On the subscriber Aurora PostgreSQL instance, connect as rds_superuser (or another superuser).

  2. Install the pglogical extension if it hasn’t been done already:

CREATE EXTENSION pglogical;
  3. Register the subscriber as a pglogical node, then create a subscription to link it to the publisher and start receiving replicated data.

-- Register the subscriber as a pglogical node
SELECT pglogical.create_node(
    node_name := 'subscriber1',
    dsn := 'host=subscriber_host port=5432 dbname=subscriber_db user=replication_user password=your_password'
);

-- Create a subscription
SELECT pglogical.create_subscription(
    subscription_name := 'my_subscription',   -- Subscription name
    provider_dsn := 'host=publisher_host port=5432 dbname=publisher_db user=replication_user password=your_password',  -- Connection details for the publisher
    replication_sets := ARRAY['default'],     -- The replication set(s) you populated on the publisher
    synchronize_data := true                  -- Copy existing data first; set to false if you preload the data yourself
);

  • Replace publisher_host, subscriber_host, publisher_db, subscriber_db, replication_user, and your_password with the actual connection details for your instances.

  • The replication sets listed must match the ones you populated on the publisher in Step 3.

  4. Verify the subscription status:

SELECT * FROM pglogical.show_subscription_status();

Step 5: Monitor and Manage Replication

Once the publisher and subscriber are set up, you can start monitoring the replication process. Here are some useful queries for monitoring:

  • Check Replication Status:

SELECT * FROM pg_stat_replication;
  • Check pglogical subscription status (on the subscriber):

SELECT * FROM pglogical.show_subscription_status();
  • Check replication lag (how far the subscriber is behind the publisher):

SELECT 
    pid, 
    usename, 
    application_name, 
    state, 
    sync_state, 
    sent_lsn, 
    write_lsn, 
    flush_lsn, 
    replay_lsn, 
    pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag
FROM pg_stat_replication;
  • Check Replicated Tables:

SELECT * FROM pglogical.tables;

Step 6: Handle Failover and Recovery

If the publisher or subscriber fails, replication pauses until the failed node recovers or a failover completes. AWS Aurora has built-in features for high availability and failover that can be combined with pglogical replication to ensure continued data replication across regions or between different clusters.

  • Multi-AZ Deployment: Aurora’s built-in failover mechanism can automatically promote a replica to be the primary if the primary instance fails.

  • Replication Resume: pglogical will attempt to resume replication once the publisher or subscriber is back online.

  • Cross-Region Replication: You can set up cross-region replication using pglogical to maintain data consistency between different geographical regions.


Other Considerations and Best Practices

  • Data Synchronization: For the initial load, you can let pglogical copy the existing data when the subscription is created (synchronize_data := true), or load the data yourself (for example, with pg_dump/pg_restore) and create the subscription with synchronize_data := false.

  • Replication Lag: If you have a high write load on the publisher, replication lag might increase. Make sure to monitor pg_stat_replication and pglogical.show_subscription_status() to track replication health.

  • Backups: Ensure that you have a regular backup strategy for both the publisher and the subscriber databases.

  • Network Latency: Since pglogical relies on network connectivity, keep network latency low between the publisher and subscriber for faster replication.


Step 7: Testing the Replication Setup

  1. On the Publisher: Insert some test data into the replicated table(s):

INSERT INTO public.my_table (id, name) VALUES (1, 'test_name');
  2. On the Subscriber: Check if the data appears in the replicated table:

SELECT * FROM public.my_table WHERE id = 1;

If everything is set up correctly, the data should appear in the subscriber database.


Final Thoughts:

  • pglogical replication on AWS Aurora PostgreSQL is a powerful tool for flexible and granular data replication. It provides high availability, cross-region replication, and the ability to replicate specific tables or schemas.

  • By adjusting replication-related parameters, you can optimize replication performance to minimize lag and ensure near-real-time data propagation.


What is pglogical replication?

pglogical replication is a logical replication system for PostgreSQL that allows you to replicate only specific parts of your database (tables, schemas, etc.) rather than the entire database. It is more flexible than physical replication, as you can selectively replicate data from one server (the publisher) to one or more servers (the subscribers) in real time.

In simpler terms:

  • Publisher: The database that sends data.

  • Subscriber: The database that receives data.

  • pglogical: The tool that enables the data to move between the publisher and the subscriber.

Key Concepts in pglogical:

  1. Publisher:

    • The database that holds the original data and sends it out for replication.

    • You create a publication in the publisher, which defines what data (tables or schemas) will be replicated to the subscriber.

  2. Subscriber:

    • The database that receives data from the publisher.

    • The subscription in the subscriber tells it where to get the data from the publisher and defines what it should replicate.

  3. Replication:

    • Logical replication means only the changes to the data (inserts, updates, deletes) are replicated, not the entire database.

    • Unlike physical replication, it allows selective replication of data (e.g., you can replicate only certain tables).

pglogical Replication Process:

  1. Setup:

    • On the publisher, you set up a publication (a set of tables or entire schemas).

    • On the subscriber, you create a subscription to that publication.

  2. Data Flow:

    • When data changes (INSERT, UPDATE, DELETE) on the publisher, pglogical replicates those changes to the subscriber in real-time.


Recommended Parameters for Speeding Up Replication:

To ensure fast and efficient replication using pglogical in Aurora PostgreSQL, there are several parameters to adjust:

1. For the Publisher (Aurora DB Cluster):

Parameter | Recommended Setting | Reason
wal_level | logical | Required for logical replication.
max_replication_slots | At least the number of subscriptions + 1 | Ensures enough replication slots for the subscribers.
max_wal_senders | At least max_replication_slots + 2 | Ensures enough replication connections for WAL streaming.
wal_sender_timeout | 60s or higher | Timeout for the WAL sender, preventing early disconnects when replication is delayed.
max_parallel_workers_per_gather | 4 or higher | Speeds up query performance if you are using parallel query features.

2. For the Subscriber:

Parameter | Recommended Setting | Reason
max_replication_slots | The number of subscriptions + reserve | Ensures the subscriber can handle the replication workload.
max_logical_replication_workers | At least the number of subscriptions + reserve | Allows enough workers to handle multiple subscriptions.
max_worker_processes | At least max_logical_replication_workers + 1 | Ensures enough worker processes for replication and other tasks.
max_sync_workers_per_subscription | 2 or higher | Enables faster synchronization during the initial data copy phase.
max_parallel_apply_workers_per_subscription | 2 or higher | Allows more parallelism for applying changes to the subscriber.
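
To confirm what the running instances actually use, you can query pg_settings on both sides. A minimal sketch; the parameter list is just a sample of the settings discussed above:

-- Show the current values of key replication settings
SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_level',
               'max_replication_slots',
               'max_wal_senders',
               'max_worker_processes',
               'max_logical_replication_workers')
ORDER BY name;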

Other Performance Tips:

  • Use SSD storage: For both publisher and subscriber, SSD-backed storage can improve I/O performance and reduce replication lag.

  • Optimize network latency: Minimize network delays between the publisher and the subscriber, as this can affect the real-time replication speed.

  • Batch updates: For bulk updates or inserts, try to use batch operations instead of row-by-row operations to speed up replication.
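
To illustrate the batching tip: each single-row statement in autocommit mode is its own transaction, so it is decoded, shipped, and applied individually, while one multi-row statement travels as a single change batch. A minimal sketch, reusing the my_table example from earlier:

-- Row-by-row: one transaction per row, each replicated separately
INSERT INTO public.my_table (id, name) VALUES (1, 'a');
INSERT INTO public.my_table (id, name) VALUES (2, 'b');

-- Batched: one transaction, one multi-row change to replicate
INSERT INTO public.my_table (id, name) VALUES (3, 'c'), (4, 'd'), (5, 'e');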


Useful and Interesting Features of AWS pglogical Replication:

  1. Easy Setup in AWS RDS/Aurora:

    • AWS provides a managed environment for setting up pglogical replication. The extension ships with Aurora PostgreSQL, so you only add it to shared_preload_libraries and run CREATE EXTENSION; there is nothing to compile or install manually.

  2. Selective Replication:

    • With pglogical, you can replicate only specific tables or even specific rows (using filtering). This allows more granular control over what data gets replicated, saving resources and improving performance (see the row-filtering sketch after this list).

  3. Cross-Region Replication:

    • pglogical can replicate data between clusters in different AWS regions, independently of or alongside Aurora Global Database. This enables disaster recovery and low-latency access to your data from different parts of the world.

  4. Support for Parallel Replication:

    • pglogical supports parallel replication, which can help speed up the replication of large data sets, especially during initial syncs or when there are large transaction volumes.

  5. Logical Replication for Upgrades:

    • pglogical can be used to replicate data during PostgreSQL version upgrades or cluster migrations. You can create a new cluster and use pglogical to replicate data from the old cluster, keeping downtime close to zero during the migration.

  6. Failover and High Availability:

    • When combined with AWS RDS Multi-AZ or Aurora's multi-region feature, pglogical replication helps ensure high availability, as the replication can automatically continue even if one instance fails.

  7. No Need for Shared Storage:

    • Unlike physical replication (which requires shared storage between nodes), logical replication with pglogical allows you to replicate between independent nodes, providing more flexibility in deployment.

  8. Change Data Capture (CDC):

    • pglogical provides change data capture for incremental data replication, allowing systems like analytics engines to consume data in near real-time, without the overhead of replicating full tables.

  9. Multi-master Replication (Experimental):

    • pglogical supports multi-master replication (although experimental), meaning you can have multiple publishers, each with its own set of subscriptions. This can be useful for complex distributed systems.
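
As an example of the row filtering mentioned in point 2, pglogical accepts a row_filter expression when a table is added to a replication set; only rows matching the filter are replicated. A minimal sketch, assuming a hypothetical public.orders table with a region column:

-- Replicate only the EU rows of public.orders
SELECT pglogical.replication_set_add_table(
    set_name := 'default',
    relation := 'public.orders',
    row_filter := $$region = 'EU'$$
);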


Quick Summary:

  • Publisher: Sends data.

  • Subscriber: Receives data.

  • pglogical: The tool that enables flexible and selective replication between PostgreSQL databases.

  • Performance Tips: Tuning parameters like max_replication_slots, max_wal_senders, max_worker_processes, and utilizing parallelism can improve replication speed.

  • AWS Features: Easy setup, cross-region replication, support for failover and high availability, and minimal downtime for migrations.


Monday, August 4, 2025

Comparison between AWS RDS PostgreSQL, AWS RDS Oracle, and AWS Aurora PostgreSQL

A comparison focusing specifically on AWS RDS PostgreSQL, AWS RDS Oracle, and AWS Aurora PostgreSQL. These are all managed services from AWS, but they have key differences in performance, scaling, features, and cost.

Comparison of AWS RDS PostgreSQL vs AWS RDS Oracle vs AWS Aurora PostgreSQL

Category | AWS RDS PostgreSQL | AWS RDS Oracle | AWS Aurora PostgreSQL
Database Engine | PostgreSQL (Open-source) | Oracle (Proprietary) | PostgreSQL-Compatible (based on PostgreSQL)
Licensing | No license cost for core engine, pay-as-you-go | Licensing fees (costly, based on edition) | No separate license fee for PostgreSQL (pay-as-you-go)
Cost | Generally lower (only pay for storage and instances) | Higher due to licensing, compute, and storage | Higher than RDS PostgreSQL, but more cost-efficient than RDS Oracle
Scalability | Horizontal scaling via read replicas; vertical scaling with instance type changes | Vertical scaling; read replicas (Oracle RAC is not supported on RDS) | Horizontal scaling via Aurora Replicas (up to 15 replicas); automatic scaling
Performance | Good performance, but may require manual tuning for higher performance | Optimized for large-scale enterprise workloads, heavy transactional performance | Superior performance, up to 2x faster than standard PostgreSQL due to Aurora's distributed storage architecture
Replication | Supports read replicas and cross-region replication | Supports read replicas via Oracle Data Guard (including cross-region) | Supports Aurora Replicas with high availability and fast failover
Availability | Multi-AZ support for high availability | Multi-AZ support; advanced features like Data Guard for HA | Built-in high availability with 6 copies of data across 3 AZs
Durability | Automated backups, point-in-time recovery (PITR) | Advanced backup, Flashback, RMAN | Automated backups, continuous backups to Amazon S3
Backup & Recovery | Automated backups, snapshots, point-in-time recovery | Flashback, advanced backup with RMAN, point-in-time recovery | Continuous backups, automatic failover, point-in-time recovery
Performance Tuning | Manual tuning required for best performance | Advanced tuning options available (partitioning, indexing) | Automatic storage scaling and optimization; best for read-heavy workloads
Data Models | Relational, JSONB, full-text search, geospatial | Relational, OLAP, XML, advanced partitioning | Relational, JSON, and PostgreSQL features, optimized for high-read workloads
Security | Encryption at rest/in-transit, IAM integration | Encryption, Database Vault, TDE, advanced security features | Encryption at rest/in-transit, IAM roles, VPC peering
Compliance | Suitable for general compliance needs | Strong compliance with HIPAA, PCI DSS, SOX, etc. | Suitable for many enterprise workloads; offers compliance for regulated industries
HA/DR (High Availability/Disaster Recovery) | Multi-AZ support for high availability | Multi-AZ plus Data Guard-based disaster recovery | Built-in HA with automatic failover within Aurora; 6 copies of data across 3 AZs
Auto Scaling | Limited to read replicas and instance resizing | Requires manual scaling | Auto-scaling of storage and compute resources, based on demand
Integration with AWS Services | Easy integration with other AWS services (Lambda, S3, CloudWatch) | Tight integration with the Oracle ecosystem | Seamless integration with other AWS services (Lambda, S3, Redshift, etc.)
Maintenance & Patches | Managed service with automatic patching | Managed service with automatic patching, but complex maintenance windows | Aurora handles patches with minimal downtime; more automation
Replication Latency | Typically low, but can be higher for cross-region replication | Low replication latency with Data Guard | Low replication latency, due to Aurora's distributed architecture
Use Cases | Good for standard PostgreSQL workloads, moderate to large apps | Best for large enterprise systems needing advanced features like OLAP and high compliance | Best for high-performance, read-heavy applications, and large-scale web applications
Migration | Easy to migrate from PostgreSQL or MySQL | More complex migrations due to proprietary nature | Easy migration from PostgreSQL (compatible with RDS PostgreSQL)
Enterprise Features | Limited to standard PostgreSQL features | Enterprise-grade features like partitioning, Flashback, and Data Guard | Performance enhancements, seamless failover, automatic scaling, and replication for high availability

Key Differences Summary:

  1. Licensing & Cost:

    • RDS PostgreSQL is the most cost-effective, being open-source with no licensing fees.

    • RDS Oracle is significantly more expensive due to licensing fees, especially for enterprise features.

    • Aurora PostgreSQL costs more than RDS PostgreSQL but is designed to be more performant and scalable, especially for large-scale read-heavy workloads.

  2. Performance & Scalability:

    • RDS PostgreSQL performs well for most use cases but might require manual optimization for high-demand applications.

    • RDS Oracle offers excellent performance for enterprise systems, especially for large transaction volumes and highly complex queries.

    • Aurora PostgreSQL outperforms standard PostgreSQL (RDS) by up to 2x, with automatic storage scaling, better replication, and high availability.

  3. High Availability & Replication:

    • RDS PostgreSQL supports Multi-AZ for high availability and read replicas for scaling, but lacks advanced replication features.

    • RDS Oracle supports high availability with Multi-AZ and Data Guard-based replicas, allowing for greater disaster recovery options (Oracle RAC itself is not supported on RDS).

    • Aurora PostgreSQL offers built-in high availability with automatic failover and replication across multiple availability zones.

  4. Use Case Suitability:

    • RDS PostgreSQL is great for startups, developers, and moderate-sized applications that need the flexibility of PostgreSQL without extra complexity.

    • RDS Oracle is a go-to for large enterprises requiring advanced features like OLAP, high compliance, or complex transactional systems.

    • Aurora PostgreSQL shines in performance-critical applications, high-traffic web apps, and cloud-native architectures that need auto-scaling and high availability.

  5. Integration & Maintenance:

    • RDS PostgreSQL integrates easily with AWS services and offers simplified management.

    • RDS Oracle also integrates well with AWS but is generally more complex to manage due to Oracle’s feature set and licensing.

    • Aurora PostgreSQL offers tight integration with AWS services and is designed for minimal downtime and maintenance, with automatic patching and scaling.


When to Choose Each:

  • AWS RDS PostgreSQL: If you need a reliable and cost-effective PostgreSQL option without the complexity of licensing and enterprise features.

  • AWS RDS Oracle: If you require enterprise-level features like OLAP, advanced partitioning, or high compliance, and are willing to bear the high licensing and management cost.

  • AWS Aurora PostgreSQL: If you need high-performance, low-latency replication, auto-scaling, and better availability, especially for read-heavy or mission-critical applications.

Wednesday, July 16, 2025

Step-by-Step Guide to become AWS DevOps Engineer

 Becoming an AWS DevOps engineer requires a combination of cloud knowledge, DevOps principles, and hands-on experience. Here's a step-by-step guide to help you along the way:

1. Understand the Basics of Cloud Computing & AWS

  • Learn Cloud Computing Concepts:

    • Understand the basics of cloud computing: what it is, types (IaaS, PaaS, SaaS), and benefits.

    • Learn about AWS and its services, including EC2, S3, VPC, IAM, RDS, etc.

    • Start with the AWS Free Tier to get hands-on experience with AWS services.

  • AWS Certified Cloud Practitioner:

    • This is an entry-level certification that covers AWS Cloud basics, which will give you a solid foundation before diving deeper into more advanced concepts.

    • Resources: AWS Training and Certification (free resources), A Cloud Guru, and AWS's own whitepapers.

2. Learn DevOps Fundamentals

  • Understand DevOps Principles:

    • Learn about Continuous Integration (CI), Continuous Delivery (CD), Infrastructure as Code (IaC), automation, monitoring, and feedback loops.

    • Focus on the cultural and organizational aspects of DevOps.

  • Learn Version Control:

    • Git is the most widely used version control system. Learn how to use Git and GitHub/GitLab/Bitbucket to manage code.

    • Master the concepts of branches, merges, pull requests, and commits.

3. Get Hands-on with AWS Core Services for DevOps

  • Compute and Networking:

    • Learn Amazon EC2, Auto Scaling, and Elastic Load Balancing (ELB) for scaling applications.

    • Familiarize yourself with Amazon VPC, Subnets, NAT Gateways, and Security Groups for networking.

  • Storage:

    • Learn Amazon S3 for object storage.

    • Understand Amazon EBS and Amazon EFS for block and file storage.

  • Database:

    • Learn Amazon RDS for managed relational databases.

    • Explore Amazon DynamoDB for NoSQL databases.

  • Security and Identity:

    • Master IAM (Identity and Access Management) to manage user permissions securely.

    • Learn about AWS KMS (Key Management Service) for managing encryption keys.

  • Monitoring and Logging:

    • Understand CloudWatch for monitoring AWS resources.

    • Learn how to use CloudTrail for logging and auditing API activity.

    • Learn about AWS X-Ray for tracing and debugging applications.

4. Learn Infrastructure as Code (IaC)

  • AWS CloudFormation:

    • Learn how to automate the creation of AWS resources using CloudFormation.

    • Study how to write and manage CloudFormation templates.

  • Terraform:

    • Although not exclusive to AWS, Terraform is widely used for IaC and works across multiple cloud providers.

    • Learn how to create, manage, and provision AWS infrastructure using Terraform.

  • Other Tools:

    • Explore AWS CDK (Cloud Development Kit) to define cloud infrastructure using code in languages like Python, TypeScript, and Java.

5. Learn CI/CD Tools & Practices

  • Jenkins:

    • Jenkins is a popular tool for automating CI/CD pipelines. Learn how to set up Jenkins servers, create Jenkinsfiles, and integrate with AWS services.

  • AWS CodePipeline:

    • AWS provides its own CI/CD service, CodePipeline, which integrates seamlessly with other AWS services like CodeCommit, CodeBuild, and CodeDeploy.

  • Docker and Kubernetes:

    • Learn Docker for containerization and understand the principles of building and running containers.

    • Familiarize yourself with Amazon EKS (Elastic Kubernetes Service) to deploy, manage, and scale Kubernetes clusters.

  • Other Tools:

    • Get hands-on with other tools like GitLab CI/CD, CircleCI, or Travis CI.

6. Learn Automation and Scripting

  • Bash/Shell Scripting:

    • Master the basics of Linux and write automation scripts using Bash or Shell.

  • Python/Scripting:

    • Python is widely used for automating tasks and interacting with AWS services using the Boto3 SDK.

  • AWS CLI:

    • Learn how to use the AWS Command Line Interface (CLI) for automating tasks and managing AWS resources.
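
As a small illustration of CLI-driven automation, the sketch below lists RDS instances and their status; the --query expression is just one example of trimming output for scripts:

# List RDS instance identifiers and their current status
aws rds describe-db-instances \
    --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]' \
    --output table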

7. Get Familiar with Monitoring & Logging

  • AWS CloudWatch:

    • Learn how to monitor and analyze logs and metrics in AWS using CloudWatch.

  • ELK Stack (Elasticsearch, Logstash, Kibana):

    • Although this is not AWS-native, it's a widely-used open-source toolset for monitoring and logging.

  • AWS X-Ray:

    • Learn to monitor and debug distributed applications running on AWS.

8. Gain Experience with Real-world Projects

  • Build a Full DevOps Pipeline:

    • Set up a CI/CD pipeline with GitHub, Jenkins, and AWS (CodePipeline, CodeBuild, CodeDeploy).

  • Containerization and Orchestration:

    • Create a Dockerized application and deploy it using Kubernetes on Amazon EKS.

  • Infrastructure Automation:

    • Automate the deployment of an entire infrastructure stack with CloudFormation or Terraform.

  • Monitoring:

    • Set up monitoring and logging using AWS CloudWatch, CloudTrail, and other tools to ensure system health.

9. Get Certified (Optional but Recommended)

  • AWS Certified DevOps Engineer – Professional:

    • This is the most relevant certification for an AWS DevOps engineer. It validates your skills in continuous delivery, automation, monitoring, and security in AWS environments.

    • Preparation: Take online courses, read AWS whitepapers, and review exam guides.

  • AWS Certified Solutions Architect – Associate:

    • If you’re new to AWS, this certification can also be helpful as it covers broader AWS concepts.

10. Keep Up with Industry Trends

  • Follow AWS Blogs:

    • AWS frequently releases new features, updates, and best practices. Follow the AWS Blog to stay informed.

  • Join Communities:

    • Participate in AWS-related forums, Reddit communities, and LinkedIn groups.

    • Attend AWS re:Invent and other conferences to network and stay updated on trends.

  • Contribute to Open Source:

    • Contributing to open-source projects related to DevOps can help you learn best practices and gain visibility.


Recommended Resources:

  • A Cloud Guru / Linux Academy – Offers great hands-on labs and courses tailored for AWS DevOps roles.

  • AWS Documentation – AWS’s own documentation is comprehensive and updated regularly.

  • Books:

    • “The Phoenix Project” (for DevOps culture)

    • “The DevOps Handbook” (for processes and tools)

    • “Infrastructure as Code” by Kief Morris (for IaC)

Final Thoughts

Becoming an AWS DevOps engineer takes time and dedication. Start by building a strong foundation in AWS and DevOps principles, then get hands-on experience through projects, certifications, and real-world practice. With continuous learning and hands-on experience, you’ll eventually become proficient in automating infrastructure, implementing CI/CD pipelines, and managing scalable, secure applications in the cloud.

Friday, June 13, 2025

Useful AWS DMS Postgres & Oracle SQL Queries

 When working with AWS Database Migration Service (DMS) and migrating between PostgreSQL and Oracle, there are key SQL queries you’ll want to use for monitoring, validating, and troubleshooting. Below is a categorized list of useful SQL queries for both PostgreSQL and Oracle, especially tailored to DMS contexts.


🔍 General Categories

  1. Schema Comparison

  2. Row Count Validation

  3. Data Type Check

  4. Change Data Capture (CDC) Monitoring

  5. Latency and Lag Analysis

  6. DMS-specific Tables & Metadata

  7. Performance Troubleshooting


📘 PostgreSQL Useful Queries

1. List All Tables

SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema');

2. Row Count per Table

SELECT schemaname, relname AS table_name, n_live_tup AS row_count
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;

3. Check Table Size

SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

4. Check Replication/CDC Lag

If logical replication is enabled:

SELECT pid, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;

5. Get Indexes for a Table

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'your_table';

📕 Oracle Useful Queries

1. List All User Tables

SELECT owner, table_name
FROM all_tables
WHERE owner NOT IN ('SYS', 'SYSTEM');

2. Row Count per Table

SELECT owner, table_name, num_rows
FROM all_tables
WHERE owner NOT IN ('SYS', 'SYSTEM');

Note: num_rows comes from optimizer statistics; run DBMS_STATS.GATHER_TABLE_STATS to refresh them, or use COUNT(*) (next query) for exact counts.

3. Actual Row Count

SELECT COUNT(*) FROM schema_name.table_name;

4. Table Size in MB

SELECT segment_name, segment_type, bytes / (1024*1024) AS size_mb
FROM dba_segments
WHERE segment_type = 'TABLE' AND owner = 'YOUR_SCHEMA';

5. Indexes for a Table

SELECT index_name, column_name
FROM all_ind_columns
WHERE table_name = 'YOUR_TABLE';

🛠️ DMS Metadata Tables (Query from Target Database)

If DMS is replicating these:

SELECT * FROM awsdms_apply_exceptions;
SELECT * FROM awsdms_heartbeat;

Use these to detect:

  • Apply lag

  • Last replicated change

  • Failed rows or skipped rows


✅ Validation Queries

Compare Row Counts (PostgreSQL vs Oracle)

On PostgreSQL:

SELECT COUNT(*) FROM public.my_table;

On Oracle:

SELECT COUNT(*) FROM my_schema.my_table;

You can automate validation using EXCEPT logic or checksum approaches.
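
One way to go beyond raw counts is a whole-table checksum. The sketch below is a minimal PostgreSQL-side example, assuming my_table has an id primary key to give the aggregate a stable order; you would compute a comparable hash on the Oracle side (for example, with DBMS_SQLHASH) and compare the results:

-- Order-stable MD5 checksum of an entire table (PostgreSQL side)
SELECT md5(string_agg(md5(t::text), '' ORDER BY t.id)) AS table_checksum
FROM public.my_table t;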


📈 Monitoring Query Example (DMS Task Monitoring)

Use the AWS DMS task logs in CloudWatch and the AWS/DMS CloudWatch metrics for the replication instance and task to monitor:

  • CDCLatencySource

  • CDCLatencyTarget

  • FullLoadThroughputRowsSource

  • FullLoadThroughputRowsTarget
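
These metrics can also be pulled from the CLI. A minimal sketch; the replication instance and task identifiers, time window, and period are placeholders, and the dimension names assume the AWS/DMS namespace:

aws cloudwatch get-metric-statistics \
    --namespace AWS/DMS \
    --metric-name CDCLatencySource \
    --dimensions Name=ReplicationInstanceIdentifier,Value=my-repl-instance \
                 Name=ReplicationTaskIdentifier,Value=my-task-id \
    --start-time 2025-06-13T00:00:00Z \
    --end-time 2025-06-13T01:00:00Z \
    --period 300 \
    --statistics Average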

Wednesday, May 7, 2025

Blue-Green Deployment for Amazon RDS databases

 

Step-by-Step: Blue-Green Deployment for Amazon RDS

🔹 Use Case

  • Safely upgrade schema/engine version.

  • Avoid downtime and rollback risks.

  • Test changes on an exact replica before promoting.


1. Create a Green (Clone) Environment

You have several options depending on your current setup:

Option A: Snapshot and Restore

  1. Take a snapshot of your current RDS (Blue).

  2. Restore a new DB instance from that snapshot (this becomes Green).

    aws rds restore-db-instance-from-db-snapshot \
        --db-instance-identifier green-db \
        --db-snapshot-identifier snapshot-of-blue

Option B: Read Replica (for supported engines like MySQL/PostgreSQL)

  1. Create a read replica of Blue.

    aws rds create-db-instance-read-replica \
        --db-instance-identifier green-db \
        --source-db-instance-identifier blue-db
  2. Promote it to a standalone DB instance (optional if testing writes).

    aws rds promote-read-replica \
        --db-instance-identifier green-db

2. Apply Changes to Green

  • Apply schema changes, engine upgrades, or parameter group updates to the Green database.

  • Use a test application or staging environment to connect to Green and validate functionality.


3. Validate Green

  • Run integration and regression tests.

  • Monitor performance, query plans, indexes, etc.

  • Verify replication (if using a read replica).

  • Ensure no application-breaking changes exist.


4. Redirect Application to Green

Once Green is validated:

Option A: Update the DB Endpoint

  • Update the application/database connection string to point to Green’s endpoint.

  • If you’re using Route 53, change the CNAME for the DB hostname.

Option B: Switchover using Route 53

  • Use a CNAME (e.g., db.myapp.com) pointing to the current RDS instance.

  • Change the CNAME to point to Green DB's endpoint.
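
The CNAME flip itself can be scripted. A minimal sketch; the hosted zone ID, record name, TTL, and Green endpoint are placeholders for your own values:

    aws route53 change-resource-record-sets \
        --hosted-zone-id Z1234567890ABC \
        --change-batch '{
          "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
              "Name": "db.myapp.com",
              "Type": "CNAME",
              "TTL": 60,
              "ResourceRecords": [{"Value": "green-db.abc123.us-east-1.rds.amazonaws.com"}]
            }
          }]
        }'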


5. Monitor Green

  • Use CloudWatch for RDS metrics (CPU, IOPS, connections).

  • Monitor application performance post-switch.

  • Optionally keep Blue as backup for rollback.


6. Decommission Blue (Optional)

  • After confirming stability, delete or snapshot the Blue instance.

    aws rds delete-db-instance \
        --db-instance-identifier blue-db \
        --skip-final-snapshot   # or provide --final-db-snapshot-identifier

🧠 Best Practices

  • Enable automated backups on both instances.

  • Use parameter groups for environment isolation.

  • Use CloudFormation or Terraform for reproducible infrastructure.

Monday, April 28, 2025

AWS RDS Commands

 Here are some commonly used AWS RDS (Relational Database Service) commands using the AWS CLI, along with practical examples for each:


🔧 1. Create a New RDS Instance

aws rds create-db-instance \
    --db-instance-identifier mydbinstance \
    --db-instance-class db.t3.micro \
    --engine mysql \
    --master-username admin \
    --master-user-password MySecurePassword123 \
    --allocated-storage 20

📝 Creates a MySQL RDS instance with 20GB of storage.


🔍 2. Describe RDS Instances


aws rds describe-db-instances

📝 Lists all your RDS instances and their details.


🛑 3. Stop an RDS Instance (only supported for certain instance types)

aws rds stop-db-instance \
    --db-instance-identifier mydbinstance

📝 Stops a running instance to save costs (only for non-multi-AZ).


▶️ 4. Start a Stopped RDS Instance


aws rds start-db-instance \
    --db-instance-identifier mydbinstance

📝 Starts an RDS instance previously stopped.


🚫 5. Delete an RDS Instance

aws rds delete-db-instance \
    --db-instance-identifier mydbinstance \
    --skip-final-snapshot

📝 Deletes the RDS instance. Use --final-db-snapshot-identifier if you want a backup.


💾 6. Create a Snapshot

aws rds create-db-snapshot \
    --db-snapshot-identifier mydbsnapshot \
    --db-instance-identifier mydbinstance

📝 Creates a manual backup (snapshot) of your DB instance.


🔁 7. Restore from Snapshot

aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier restoredinstance \
    --db-snapshot-identifier mydbsnapshot

📝 Restores a new DB instance from a snapshot.


🛡️ 8. Modify DB Instance (e.g., instance class)

aws rds modify-db-instance \
    --db-instance-identifier mydbinstance \
    --db-instance-class db.t3.small \
    --apply-immediately

📝 Changes instance type and applies it immediately.


🌐 9. Create a Parameter Group

aws rds create-db-parameter-group \
    --db-parameter-group-name myparamgroup \
    --db-parameter-group-family mysql8.0 \
    --description "My custom params"

📝 Used to customize DB settings, like enabling slow query logs.


📄 10. List Snapshots

aws rds describe-db-snapshots

📝 Displays available snapshots you can restore from.

Wednesday, March 26, 2025

Oracle OCI & AWS CLI commands

AWS CLI and OCI CLI commands side by side for easier comparison. The table below pairs AWS and OCI CLI commands across both platforms; most rows pair equivalent actions, while a few simply list commonly used commands from each platform side by side. It should help you switch between the two or use both in a multi-cloud environment.

AWS CLI Command | OCI CLI Command
Configure AWS CLI: aws configure | Configure OCI CLI: oci setup config
EC2: List Instances: aws ec2 describe-instances | Compute: List Instances: oci compute instance list --compartment-id <compartment_ocid>
EC2: Start an Instance: aws ec2 start-instances --instance-ids <instance_id> | Compute: Create an Instance: oci compute instance launch --compartment-id <compartment_ocid> --availability-domain <availability_domain> --shape <instance_shape> --image-id <image_ocid> --subnet-id <subnet_ocid> --display-name <instance_name>
EC2: Stop an Instance: aws ec2 stop-instances --instance-ids <instance_id> | Compute: Stop an Instance: oci compute instance action --instance-id <instance_ocid> --action STOP
S3: List Buckets: aws s3 ls | Object Storage: List Buckets: oci os bucket list --compartment-id <compartment_ocid>
S3: Upload a File: aws s3 cp <file> s3://<bucket_name>/<file_name> | Object Storage: Upload a File: oci os object put --bucket-name <bucket_name> --name <object_name> --file <local_file_path>
S3: Download a File: aws s3 cp s3://<bucket_name>/<file_name> <file> | Object Storage: Download a File: oci os object get --bucket-name <bucket_name> --name <object_name> --file <local_file_path>
IAM: List Users: aws iam list-users | IAM: List Users: oci iam user list --compartment-id <compartment_ocid>
IAM: Create a User: aws iam create-user --user-name <user_name> | IAM: Create a User: oci iam user create --compartment-id <compartment_ocid> --name <user_name> --description "<description>"
Lambda: List Functions: aws lambda list-functions | Database: List Databases: oci db system list --compartment-id <compartment_ocid>
CloudWatch: List Alarms: aws cloudwatch describe-alarms | Monitoring: List Alarms: oci monitoring alarm list --compartment-id <compartment_ocid>
CloudFormation: Describe Stacks: aws cloudformation describe-stacks | Load Balancer: List Load Balancers: oci lb load-balancer list --compartment-id <compartment_ocid>
RDS: Describe DB Instances: aws rds describe-db-instances | Load Balancer: Create a Load Balancer: oci lb load-balancer create --compartment-id <compartment_ocid> --shape <load_balancer_shape> --subnet-id <subnet_ocid> --display-name <load_balancer_name>
DynamoDB: List Tables: aws dynamodb list-tables | Block Volume: List Volumes: oci bv volume list --compartment-id <compartment_ocid>
Route 53: List Hosted Zones: aws route53 list-hosted-zones | Block Volume: Attach a Volume: oci bv volume attach --volume-id <volume_ocid> --instance-id <instance_ocid> --device <device_name>
SNS: List Topics: aws sns list-topics | Autonomous Database: List Autonomous Databases: oci db autonomous-database list --compartment-id <compartment_ocid>
ECR: List Repositories: aws ecr describe-repositories | Cloud Guard: List Security Problems: oci cloud-guard problem list --compartment-id <compartment_ocid>
Cost Explorer: Get Usage: aws ce get-cost-and-usage --time-period Start=2025-01-01,End=2025-01-31 --granularity DAILY --metrics "AmortizedCost" | Database: Create a Database: oci db system launch --compartment-id <compartment_ocid> --availability-domain <availability_domain> --shape <db_shape> --subnet-id <subnet_ocid> --display-name <db_system_name> --database-edition <database_edition> --admin-password <admin_password>
STS: Get Caller Identity: aws sts get-caller-identity | Compute: List Instances: oci compute instance list --compartment-id <compartment_ocid>
CloudTrail: Lookup Events: aws cloudtrail lookup-events --max-results 10 | Compute: Stop an Instance: oci compute instance action --instance-id <instance_ocid> --action STOP
CloudFront: List Distributions: aws cloudfront list-distributions | Object Storage: Upload a File: oci os object put --bucket-name <bucket_name> --name <object_name> --file <local_file_path>

Below is a table comparing AWS and OCI commands related to Networking. These commands cover networking services like Virtual Private Cloud (VPC) in AWS and Virtual Cloud Network (VCN) in OCI. This table compares networking-related AWS CLI and OCI CLI commands for similar tasks like managing VPCs/VCNs, subnets, security groups, route tables, VPNs, and more. For both platforms, these commands allow you to configure, manage, and troubleshoot your networking infrastructure effectively.

AWS CLI Command | OCI CLI Command
VPC: Create VPC: aws ec2 create-vpc --cidr-block <cidr_block> | VCN: Create a VCN: oci network vcn create --compartment-id <compartment_ocid> --cidr-block <cidr_block> --display-name <vcn_name>
VPC: List VPCs: aws ec2 describe-vpcs | VCN: List VCNs: oci network vcn list --compartment-id <compartment_ocid>
VPC: Create Subnet: aws ec2 create-subnet --vpc-id <vpc_id> --cidr-block <cidr_block> | VCN: Create a Subnet: oci network subnet create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --cidr-block <cidr_block> --availability-domain <availability_domain> --display-name <subnet_name>
VPC: List Subnets: aws ec2 describe-subnets --filters "Name=vpc-id,Values=<vpc_id>" | VCN: List Subnets: oci network subnet list --compartment-id <compartment_ocid> --vcn-id <vcn_ocid>
VPC: Create Internet Gateway: aws ec2 create-internet-gateway | VCN: Create an Internet Gateway: oci network internet-gateway create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --display-name <ig_name>
VPC: Attach Internet Gateway: aws ec2 attach-internet-gateway --vpc-id <vpc_id> --internet-gateway-id <ig_id> | VCN: Attach an Internet Gateway: oci network vcn attach-internet-gateway --vcn-id <vcn_ocid> --internet-gateway-id <internet_gateway_ocid>
VPC: Create Route Table: aws ec2 create-route-table --vpc-id <vpc_id> | VCN: Create a Route Table: oci network route-table create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --display-name <route_table_name>
VPC: List Route Tables: aws ec2 describe-route-tables --filters "Name=vpc-id,Values=<vpc_id>" | VCN: List Route Tables: oci network route-table list --compartment-id <compartment_ocid> --vcn-id <vcn_ocid>
VPC: Create Route: aws ec2 create-route --route-table-id <route_table_id> --destination-cidr-block <cidr_block> --gateway-id <gateway_id> | VCN: Create a Route in Route Table: oci network route-rule create --route-table-id <route_table_id> --destination <cidr_block> --network-entity-id <gateway_ocid>
VPC: Modify Route Table: aws ec2 replace-route --route-table-id <route_table_id> --destination-cidr-block <cidr_block> --gateway-id <gateway_id> | VCN: Update Route Table: oci network route-rule update --route-table-id <route_table_id> --route-rule-id <route_rule_id> --destination <cidr_block> --network-entity-id <gateway_ocid>
VPC: Delete Subnet: aws ec2 delete-subnet --subnet-id <subnet_id> | VCN: Delete Subnet: oci network subnet delete --subnet-id <subnet_ocid>
VPC: Delete Internet Gateway: aws ec2 delete-internet-gateway --internet-gateway-id <ig_id> | VCN: Delete Internet Gateway: oci network internet-gateway delete --internet-gateway-id <internet_gateway_ocid>
VPC: Create Security Group: aws ec2 create-security-group --group-name <group_name> --description <description> --vpc-id <vpc_id> | VCN: Create a Security List (OCI equivalent of Security Group): oci network security-list create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --display-name <security_list_name>
VPC: List Security Groups: aws ec2 describe-security-groups --filters "Name=vpc-id,Values=<vpc_id>" | VCN: List Security Lists: oci network security-list list --compartment-id <compartment_ocid> --vcn-id <vcn_ocid>
VPC: Create Network ACL: aws ec2 create-network-acl --vpc-id <vpc_id> | VCN: Create a Network Security Group: oci network network-security-group create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --display-name <nsg_name>
VPC: Attach Network ACL: aws ec2 associate-network-acl --network-acl-id <acl_id> --subnet-id <subnet_id> | VCN: Attach a Network Security Group: oci network network-security-group add-vnics --network-security-group-id <nsg_id> --nics <nic_ocid>
VPC: Create VPC Peering: aws ec2 create-vpc-peering-connection --vpc-id <vpc_id> --peer-vpc-id <peer_vpc_id> | VCN: Create VCN Peering: oci network vcn-peering-connection create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --peer-vcn-id <peer_vcn_ocid> --display-name <peering_name>
VPC: Accept VPC Peering: aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id <peering_id> | VCN: Accept VCN Peering: oci network vcn-peering-connection accept --vcn-peering-connection-id <peering_ocid>
VPC: Delete VPC Peering: aws ec2 delete-vpc-peering-connection --vpc-peering-connection-id <peering_id> | VCN: Delete VCN Peering: oci network vcn-peering-connection delete --vcn-peering-connection-id <peering_ocid>
VPC: Create VPN Gateway: aws ec2 create-vpn-gateway --type <vpn_type> --amazon-side-asn <asn> | VCN: Create a VPN Gateway: oci network vpn-gateway create --compartment-id <compartment_ocid> --vcn-id <vcn_ocid> --display-name <vpn_gateway_name>
VPC: Attach VPN Gateway: aws ec2 attach-vpn-gateway --vpc-id <vpc_id> --vpn-gateway-id <vpn_gateway_id> | VCN: Attach a VPN Gateway: oci network vpn-gateway attach --vcn-id <vcn_ocid> --vpn-gateway-id <vpn_gateway_ocid>
VPC: List VPN Gateways: aws ec2 describe-vpn-gateways | VCN: List VPN Gateways: oci network vpn-gateway list --compartment-id <compartment_ocid>

Here’s a comparison table of AWS CLI and OCI CLI commands related to backup and restore from snapshots for both AWS and OCI. These commands cover common actions like creating snapshots, listing snapshots, and restoring from them.

AWS CLI Command | OCI CLI Command
EC2: Create Snapshot (Backup an EBS volume): aws ec2 create-snapshot --volume-id <volume_id> --description "<description>" | Block Volume: Create Backup (Back up a block volume in OCI): oci bv backup create --volume-id <volume_id> --display-name "<backup_name>"
EC2: Describe Snapshots (List snapshots): aws ec2 describe-snapshots --owner-ids <owner_id> | Block Volume: List Backups (List backups of block volumes): oci bv backup list --compartment-id <compartment_ocid>
EC2: Delete Snapshot (Delete a snapshot): aws ec2 delete-snapshot --snapshot-id <snapshot_id> | Block Volume: Delete Backup (Delete a backup): oci bv backup delete --backup-id <backup_id>
EC2: Restore from Snapshot (Create volume from snapshot): aws ec2 create-volume --snapshot-id <snapshot_id> --availability-zone <availability_zone> | Block Volume: Restore from Backup (Create a new volume from a backup): oci bv volume create --volume-backup-id <backup_id> --availability-domain <availability_domain> --compartment-id <compartment_ocid>
RDS: Create Snapshot (Backup an RDS instance): aws rds create-db-snapshot --db-instance-identifier <db_instance_id> --db-snapshot-identifier <snapshot_name> | Autonomous Database: Create a Backup (Autonomous DB backup in OCI): oci db autonomous-database backup create --autonomous-database-id <autonomous_db_ocid> --display-name <backup_name>
RDS: Describe Snapshots (List RDS snapshots): aws rds describe-db-snapshots --db-instance-identifier <db_instance_id> | Autonomous Database: List Backups (List backups of Autonomous Database): oci db autonomous-database backup list --compartment-id <compartment_ocid>
RDS: Delete Snapshot (Delete an RDS snapshot): aws rds delete-db-snapshot --db-snapshot-identifier <snapshot_id> | Autonomous Database: Delete a Backup (Delete Autonomous DB backup): oci db autonomous-database backup delete --autonomous-database-id <autonomous_db_ocid> --backup-id <backup_id>
RDS: Restore from Snapshot (Restore an RDS instance from snapshot): aws rds restore-db-instance-from-db-snapshot --db-instance-identifier <new_db_instance_id> --db-snapshot-identifier <snapshot_id> | Autonomous Database: Restore from Backup (Restore Autonomous DB from backup): oci db autonomous-database restore-from-backup --autonomous-database-id <autonomous_db_ocid> --backup-id <backup_id>

Key Points:

  • AWS EC2 primarily uses EBS snapshots for backup and restore of volumes. You can create a volume from a snapshot to restore it.

  • AWS RDS provides an option to create and restore database snapshots for RDS instances.

  • OCI Block Volume uses volume backups (backups of block volumes) to manage snapshots, and you can restore from backups to a new volume.

  • OCI Autonomous Database allows backup creation and restore for Autonomous Databases via backups.

Both AWS and OCI offer similar snapshot and restore functionalities but with slight differences in the specific commands and services they offer for database and block volume backups.


Thursday, March 20, 2025

Oracle DGMGRL Switchover & Switchback Steps

 To perform a switchover and switchback using Oracle Data Guard with dgmgrl (Data Guard Manager command-line interface), you typically use the following steps. These steps assume you have already configured Oracle Data Guard and are familiar with the concepts of Primary and Standby databases.

1. Switchover from Primary to Standby Database

A switchover is a planned role reversal between the Primary and Standby databases. This process allows the Standby database to become the new Primary and vice versa.

Steps for Switchover:

  1. Log in to Data Guard Manager (DGMGRL): Open a command-line terminal on the Primary database machine and log into the Data Guard Manager using dgmgrl.

    dgmgrl
    
  2. Connect to the Data Guard Configuration: Connect to your Data Guard configuration.

    DGMGRL> connect sys/password@primary
    

    Replace sys/password with the appropriate credentials and the correct TNS name for your primary database.

  3. Check the Data Guard Status: Before proceeding, it's good practice to check the status of the Data Guard configuration.

    DGMGRL> show configuration;
    

    Verify that the configuration is in sync and there are no issues.

  4. Perform the Switchover: Execute the switchover command to initiate the role reversal.

    DGMGRL> switchover to standby;
    

    Replace standby with the correct name of the standby database if needed.

    This command will:

    • Convert the current Primary into a Standby.
    • Convert the current Standby into the new Primary.
    • Start Redo Apply on the new Standby so that it stays synchronized with the new Primary.
    • If necessary, restart the Oracle instances to complete the switchover.

    After the switchover completes, the old Primary will be the new Standby, and the old Standby will be the new Primary.

  5. Verify the Switchover: Check that the switchover was successful by viewing the configuration status.

    DGMGRL> show configuration;
    

    Ensure that the roles have been reversed correctly.


2. Switchback from Standby to Primary Database

Switching back is simply performing a reverse of the original switchover process, where the Primary database becomes the Standby again.

Steps for Switchback:

  1. Log in to Data Guard Manager (DGMGRL): Open a terminal on the current Primary (which was previously the Standby) and log into the Data Guard Manager using dgmgrl.

    dgmgrl
    
  2. Connect to the Data Guard Configuration: Connect to the Data Guard configuration.

    DGMGRL> connect sys/password@primary
    
  3. Check the Data Guard Status: Check the current status to ensure everything is running smoothly.

    DGMGRL> show configuration;
    
  4. Perform the Switchback: Execute the switchover command again, but this time switch back to the original Primary (which is now the Standby).

    DGMGRL> switchover to primary;
    

    This command will:

    • Reverse the roles of the Primary and Standby databases once again.
    • Restart Redo Apply on the new Standby (the database that just gave up the Primary role).
    • Synchronize the databases.
  5. Verify the Switchback: Check the Data Guard configuration to verify the new roles.

    DGMGRL> show configuration;
    

Things to Consider:

  • Synchronization: Ensure that redo transport and apply are synchronized before performing any switchover. You can use the show database verbose command to check the synchronization status (see the example after this list).

  • Redundancy and Backup: Always ensure you have current backups of the databases before performing a switchover or switchback, as these operations can potentially cause downtime.

  • Archive Logs: Before switching over (and again before switching back), verify that all archived redo logs have shipped and been applied.

  • Data Guard Broker: DGMGRL is the broker's command-line interface; with the broker configured, a single SWITCHOVER command coordinates all the role-change steps for you, without manual intervention on each database.
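
A minimal status check, assuming the standby's db_unique_name is STANDBY_DB; in recent releases the verbose output includes Transport Lag and Apply Lag, both of which should be close to zero before you switch over:

    DGMGRL> show database verbose 'STANDBY_DB';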


By following these steps, you can successfully perform a switchover and switchback operation with Oracle Data Guard using dgmgrl. Always perform these operations during planned maintenance windows to minimize downtime.

For a comprehensive and official reference on performing switchover and switchback operations in Oracle Data Guard using dgmgrl, I recommend checking Oracle's official documentation:

  • Oracle Data Guard Concepts and Administration Guide: This is the most authoritative source for Data Guard operations, including switchover and switchback procedures.

You can find the relevant documentation here: Oracle Data Guard Concepts and Administration

You can specifically refer to the following sections for detailed steps on switchover and switchback:

  • Chapter 5: Switchover and Failover Procedures (for steps on switchover and switchback)
  • Chapter 10: Data Guard Command-Line Interface (DGMGRL) (for DGMGRL command usage)

Oracle Data Guard Broker, 19c