Sometimes you need to fetch the EMR master node’s IP address dynamically for automation.
One such use case is connecting Amazon SageMaker notebook instances to EMR through Livy. For this, the EMR master’s private IP address must be set in the Sparkmagic configuration.
Prerequisites
- Python3
- Boto3 installation
- Valid AWS credentials (for example, configured through the AWS CLI) with access to the EMR service.
Python boto3 Script To Retrieve EMR Master IP
Here is the Python script that retrieves the EMR master’s private IP address, using the EMR cluster ID as a parameter.
Note: This script returns the IP of the first running instance in the MASTER instance group.
import boto3
import json

boto_client_emr = boto3.client("emr")
cluster_id = "j-DFGDSFGREWG"

def get_emr_master_pvt_ip(boto_client_emr, cluster_id):
    """Return the private IP of the first running master instance."""
    # Ask EMR only for running instances in the MASTER instance group.
    emr_list_instance_rep = boto_client_emr.list_instances(
        ClusterId=cluster_id,
        InstanceGroupTypes=[
            'MASTER',
        ],
        InstanceStates=[
            'RUNNING',
        ]
    )
    return emr_list_instance_rep["Instances"][0]["PrivateIpAddress"]

emr_master_ip = get_emr_master_pvt_ip(boto_client_emr, cluster_id)
print("EMR master IP is" + " " + emr_master_ip)
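The script above indexes the first element of the returned instance list, so it fails with an IndexError if no master instance is in the RUNNING state (for example, while the cluster is still starting). The following is a minimal sketch of a more defensive variant, not part of the original script. The function name get_emr_master_pvt_ip_checked is hypothetical; it makes the same list_instances call and only adds an explicit check.

```python
def get_emr_master_pvt_ip_checked(boto_client_emr, cluster_id):
    """Return the private IP of the first running master instance.

    Hypothetical variant of get_emr_master_pvt_ip: raises RuntimeError
    with a readable message instead of IndexError when nothing matches.
    """
    response = boto_client_emr.list_instances(
        ClusterId=cluster_id,
        InstanceGroupTypes=["MASTER"],
        InstanceStates=["RUNNING"],
    )
    instances = response.get("Instances", [])
    if not instances:
        raise RuntimeError(
            f"No running master instance found for cluster {cluster_id}"
        )
    return instances[0]["PrivateIpAddress"]


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ip = get_emr_master_pvt_ip_checked(boto3.client("emr"), "j-DFGDSFGREWG")
```

Because the client is passed in as an argument, the function can be exercised with a stub object in tests without calling AWS.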
Module Requirements
- The AWS boto3 Python module, which must be installed.
- The json module, which is part of the Python standard library.
Script Input
The only input required for this script is the EMR cluster ID. You can get this ID from the EMR cluster dashboard.
In the script it is shown as cluster_id = "j-DFGDSFGREWG". When you execute the script, replace the ID with your EMR cluster ID.
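If you would rather not edit the script for every cluster, the ID can be read from the command line instead. This is a small sketch using argparse from the standard library. The helper name parse_cluster_id and the positional argument name are assumptions made for illustration, not part of the original script.

```python
import argparse


def parse_cluster_id(argv=None):
    """Read the EMR cluster ID from the command line (hypothetical helper).

    argv defaults to None, in which case argparse reads sys.argv[1:].
    """
    parser = argparse.ArgumentParser(
        description="Print the private IP of an EMR cluster's master node."
    )
    parser.add_argument(
        "cluster_id",
        help="EMR cluster ID, for example j-DFGDSFGREWG",
    )
    args = parser.parse_args(argv)
    return args.cluster_id


# In the script, the hard-coded assignment could then become:
#   cluster_id = parse_cluster_id()
```

With this change the script would be invoked as python3 emr.py followed by the cluster ID.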
Script Usage
The following line of code stores the master IP in the emr_master_ip variable.
emr_master_ip = get_emr_master_pvt_ip(boto_client_emr, cluster_id)
You can use this variable as per your requirement.
get_emr_master_pvt_ip is the function that accepts the boto3 EMR client and the cluster ID and returns the private IP address of the master node.
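As one example of using the returned IP, the Sagemaker use case mentioned at the start needs a Livy URL in the Sparkmagic configuration. The sketch below builds that URL and sets it in a configuration dictionary. It assumes Livy listens on its default port 8998 and that the Sparkmagic config uses the kernel_python_credentials and kernel_scala_credentials sections with a url key, as in the Sparkmagic example config; check both against your own setup. The helper names build_livy_url and set_livy_url are hypothetical.

```python
import copy


def build_livy_url(master_ip, port=8998):
    """Build the Livy endpoint URL (8998 is assumed to be the Livy port)."""
    return f"http://{master_ip}:{port}"


def set_livy_url(config, livy_url):
    """Return a copy of a Sparkmagic config dict with the Livy URL set.

    The section names below are assumptions taken from the Sparkmagic
    example config; adjust them to match your own config file.
    """
    updated = copy.deepcopy(config)
    for section in ("kernel_python_credentials", "kernel_scala_credentials"):
        updated.setdefault(section, {})["url"] = livy_url
    return updated


# Usage with the variable from the script:
#   new_config = set_livy_url(existing_config, build_livy_url(emr_master_ip))
```

The function returns a new dictionary rather than modifying the input, so the original configuration stays available if the update needs to be rolled back.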
If you execute the given code, you will get an output with a message showing the Master IP address. It comes from the print statement in the script.
Use python3 to execute the script as shown below.
➜ python3 emr.py
EMR master IP is 10.0.0.201