Amazon AWS Certified Database Specialty – Amazon ElastiCache Part 2

June 13, 2023

7. Redis global datastore

Now let’s talk about the Redis Global Datastore. So Global Datastore is something that allows you to create cross-region replicas for your Redis cluster. So with Redis Global Datastore, you have one writer cluster and multiple reader clusters. So the writer cluster is the primary cluster, whereas the reader clusters are called secondary clusters. And you can replicate to up to two other regions. So you can have a maximum of three regions in your global datastore. And this definitely improves your local read latency because it brings data closer to your end users. And this is also useful for DR, or disaster recovery. And you have to manually promote a secondary cluster to be the primary.

It is not automatic. And the global datastore is only available for multi-node clusters. It’s not available for a single-node cluster. So you must convert your cluster to a replication group before you can create a global datastore. And security for cross-region communication is provided through VPC peering. So these clusters can communicate through VPC peering. But there are certain limitations when you use a global datastore. For example, member clusters cannot be modified or resized as usual. You scale the clusters by modifying the global datastore, and all the member clusters get scaled. So when you scale, all the clusters that are part of the global datastore get scaled together.

So to modify a global datastore’s parameters, you modify the parameter group of any one of the member clusters, and the change gets applied to all the member clusters automatically. And the replication latency is typically under 1 second. So the data is replicated cross-region in under 1 second. And this is a typical figure and it’s not an SLA, so this is not guaranteed. But in general, you will typically see a latency of under 1 second. And this also means that the RPO is under 1 second. So the amount of data lost due to any outage will typically be under 1 second’s worth. And the typical RTO, or the typical downtime after an outage, is about 1 minute. All right, so RPO is about 1 second and RTO is about 1 minute.

8. Redis – Good things to know

Now, let’s talk about some of the good things to know about Redis. Now, replica lag may grow or shrink over time, and if a replica is too far behind the primary, then you can reboot it to bring it in sync with the primary instance. In case of latency or throughput issues, scaling out the cluster can help. Similarly, if you experience memory pressure, then scaling out will help. If the cluster is overscaled, then you can scale in to reduce your costs. And in case of online scaling, remember that the cluster remains available but with some performance degradation.

And the level of degradation will depend on the amount of CPU utilization and the amount of data in your cluster. And another important thing that you should keep in mind is you cannot change the Redis cluster mode after you create the cluster. So if you have a cluster mode disabled cluster, then you cannot convert it into a cluster mode enabled cluster. But what you can do is create a new cluster and warm start it with your preexisting data. And remember that all nodes within a cluster are always of the same instance type. All right, so that’s about it. Let’s continue to the next lecture.

9. Redis best practices

Now let’s talk about Redis best practices. So when you use a cluster mode enabled cluster, you should connect using the configuration endpoint. So when you use the configuration endpoint, it allows for auto discovery of shard and key space mappings, or in other words, if any of the shards get added or removed, you don’t have to make any changes to your application. So whenever any shards get added or removed, the configuration endpoint gets updated automatically. In case of a cluster mode disabled cluster, you should use the primary endpoint for writes and the reader endpoint for reads, and these are also always kept up to date with any cluster changes, just like the configuration endpoint with a cluster mode enabled cluster. And it’s a good idea to set the reserved-memory-percent parameter to about 25%. So this memory is used for background processes or non-data processes.

And when you use a reserved-memory-percent of about 25%, you are much less likely to experience performance issues with your cluster. Again, the socket timeout should be kept to at least 1 second. If you set it too low, then there could be numerous timeouts when you have high load. If you set it too high, then your application might take longer to detect any connection issues. So the ideal recommendation is to use a socket timeout of about 1 second. Again, the DNS caching timeout, or the TTL, should be kept as low as possible. About 5 to 10 seconds of TTL is recommended for DNS caching, and you should never use the cache forever option for DNS caching, because if your application is caching the DNS information and there are any changes to your cluster, then your application is going to experience connectivity issues.
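To make the DNS caching advice concrete, here is a minimal sketch of a resolver cache with a short TTL. The class name `TTLResolver`, its parameters, and the hostname used below are all made up for illustration; a real application would usually configure this in its runtime or client library rather than hand-roll it.

```python
import time
from typing import Callable, Dict, Tuple


class TTLResolver:
    """Caches hostname lookups, but only for a short time-to-live."""

    def __init__(
        self,
        resolve: Callable[[str], str],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve      # the real lookup, injected by the caller
        self._ttl = ttl_seconds      # 5 to 10 seconds, per the recommendation
        self._clock = clock          # injectable so the sketch is testable
        self._cache: Dict[str, Tuple[str, float]] = {}

    def lookup(self, hostname: str) -> str:
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and now < cached[1]:
            return cached[0]         # entry is still fresh, reuse it
        # Entry is missing or expired: resolve again so that endpoint
        # changes in the cluster are picked up within one TTL.
        address = self._resolve(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address
```

With a TTL of 5 seconds, a changed endpoint address is seen by the application at most about 5 seconds late. A cache forever setting corresponds to an entry that never expires, which is exactly the case the lecture warns against.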

10. Redis use cases

Now let’s look at some of the use cases for Redis. First, the gaming leaderboards. So the gaming leaderboards typically use Redis sorted sets. So a sorted set is something that automatically stores your data in a sorted fashion. For example, you could use this to store the top ten scores of a game. So you have your Redis cluster, different players across the globe are playing the game, so data is flowing to Redis, and you can use Redis to generate a real-time leaderboard of the top ten scores. And since we are using sorted sets, the leaderboard data is automatically sorted. So you always see the top scores in real time. The second use case is pub/sub messaging or messaging queues. So you have your Redis pub/sub server.

You can have different publishers sending messages to the Redis server, and different subscribers then subscribe to these messages and receive them as the publishers publish them. So different publishers publish the messages, and the Redis server relays these messages to the subscribers, much like a queue. And then the different subscribers will receive the messages in the same order. And different subscribers will also be able to choose which messages to subscribe to. So this is one of the use cases: pub/sub messaging, or having publishers publish messages and subscribers receive those messages. Another use case is recommendation data.

This particularly uses the increment and decrement counters in Redis. So for example, you can use Redis hashes to maintain a list of who liked or who disliked a particular product. So for example, you can have this data of likes and dislikes coming in from different sources like mobile phones, computers or even IoT devices. And you can push that data to a Kinesis data stream. And this data stream can be consumed by a Lambda function, which can then write data to your ElastiCache Redis cluster. And further, you can use the data stored in ElastiCache in your analytics applications for generating different recommendations. So that’s about the use cases of ElastiCache for Redis. Let’s continue to the next lecture.
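The leaderboard use case from this lecture can be sketched in plain Python. This is only a stand-in for a Redis sorted set, which keeps members ordered by score on the server side (through commands such as ZADD and ZREVRANGE); the `Leaderboard` class and its method names are hypothetical, and keeping only each player's best score is a choice made for this sketch.

```python
from typing import Dict, List, Tuple


class Leaderboard:
    """In-process stand-in for a sorted set of (player, score) pairs."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}

    def submit(self, player: str, score: float) -> None:
        # Keep only the best score seen so far for each player.
        best = self._scores.get(player)
        if best is None or score > best:
            self._scores[player] = score

    def top(self, n: int = 10) -> List[Tuple[str, float]]:
        # Highest score first; ties are broken by player name here.
        ranked = sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:n]


board = Leaderboard()
board.submit("alice", 120)
board.submit("bob", 300)
board.submit("carol", 250)
board.submit("alice", 280)   # alice improves her score
print(board.top(2))
```

Unlike this sketch, which sorts on every read, a sorted set maintains the ordering as scores arrive, which is why it suits a real-time leaderboard.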

11. Memcached overview

Now, let’s talk about the second flavor of ElastiCache. That’s Memcached. So Memcached is a simple in-memory key-value store that provides you with microsecond latency, or you can also call it sub-millisecond latency. All right? And Memcached supports automatic detection and recovery from cache node failures. So if any of the cache nodes fail, Memcached can automatically detect this and recover from these failures. And Memcached supports simple applications. So the typical applications are session stores, persistent as well as transient session data stores.

And of course, you can use Memcached to store the results of your database queries, both for relational as well as for non-relational or NoSQL databases. So RDS or even DynamoDB. You can use Memcached for web page caching, for API caching, as well as for object caching, or in other words, image files, video files, and their metadata as well. And this is typically suited for simple applications like web and mobile apps. And you can also use it for gaming, IoT, ad tech, e-commerce and so on. So that’s about the overview of Memcached. Now let’s look at the architecture of Memcached.

12. Memcached architecture

Just like Redis, a Memcached cluster is also placed in a private subnet within your VPC, and you can access it from an EC2 instance, which you can place in a public subnet in the VPC. And Memcached allows access only from the EC2 network, so your application should be hosted on a whitelisted EC2 instance. You can whitelist your instances using security groups. And for the Memcached cluster, you can create up to 20 nodes per cluster. So you don’t have a concept of shards with Memcached, but you can have up to 20 nodes per cluster, and you can distribute data across these nodes. And Memcached doesn’t support read replicas. If there is a node failure, that’s going to result in data loss. And to reduce this kind of data loss, you can deploy your nodes in a Multi-AZ setup. All right, so let’s continue to the next lecture.

13. Memcached auto discovery

Now, let’s talk about Memcached auto discovery. Now, this is something that allows your client applications to automatically identify nodes in your Memcached cluster. So you don’t have to manually connect to individual nodes. You simply connect to any one node using the configuration endpoint and retrieve a list of all other nodes. So let’s say you have these five nodes in your Memcached cluster, and then you connect to one of the nodes, and it is going to return metadata containing a list of all nodes. And then you can use this metadata, or the list of nodes, to connect to any of the other nodes.

And this metadata that contains the list of nodes gets updated dynamically whenever you add or remove any nodes from your Memcached cluster. So any node failures are also automatically detected, the nodes are replaced, and the metadata gets updated. And this auto discovery feature is enabled by default on your Memcached cluster. The only thing is, you have to use an auto discovery capable client to connect to your Memcached cluster. And if you log into your ElastiCache dashboard within the AWS console, you can download an auto discovery capable client from there as well. All right, so that’s about it. Let’s continue to the next lecture.
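As a rough illustration of what an auto discovery capable client does with that metadata, the sketch below parses a node list. The payload layout is an assumption made for this example (a version number on the first line, then space-separated `hostname|ip|port` entries); the hostnames are made up, and `parse_node_list` and `CacheNode` are hypothetical names, not part of any AWS client.

```python
from typing import List, NamedTuple, Tuple


class CacheNode(NamedTuple):
    hostname: str
    ip: str
    port: int


def parse_node_list(payload: str) -> Tuple[int, List[CacheNode]]:
    """Parse an assumed metadata payload into (version, nodes)."""
    lines = [line.strip() for line in payload.splitlines() if line.strip()]
    version = int(lines[0])  # changes whenever the node list changes
    entries = lines[1].split() if len(lines) > 1 else []
    nodes = []
    for entry in entries:
        hostname, ip, port = entry.split("|")
        nodes.append(CacheNode(hostname, ip, int(port)))
    return version, nodes


# Made-up payload in the assumed layout.
sample = (
    "12\n"
    "node-1.example.internal|10.0.0.11|11211 "
    "node-2.example.internal|10.0.0.12|11211\n"
)
version, nodes = parse_node_list(sample)
print(version, [node.hostname for node in nodes])
```

A client would typically re-fetch this list periodically and reconnect only when the version number changes.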

14. Memcached scaling

Now, let’s talk about scaling in Memcached. So, Memcached does not support vertical scaling. So, if you want to scale or resize your instance, you can do that by creating a new cluster and then migrating your application. So you create a different cluster with the new size that you want, and then migrate your application or point your application to the new cluster that you created. So, this is the workaround you can use to scale your Memcached cluster vertically. Then, talking about horizontal scaling. So, horizontal scaling allows you to partition your data across multiple nodes. So this simply means that you can add or remove nodes from your Memcached cluster.

The maximum that a Memcached cluster supports is 20 nodes per cluster, and you can have up to 100 nodes per region. Of course, this is a soft limit. And when you add or remove nodes, you don’t have to change your endpoints post scaling if you are using auto discovery. And we have already seen what auto discovery is, right? And one thing that you should keep in mind is you have to remap at least some of your key spaces after a scaling operation. That is, you have to evenly spread your cache keys across the different nodes when you add or remove nodes from your Memcached cluster. All right, so that’s about it. Let’s go into a demo and create a Memcached cluster.
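To see why keys need remapping after a scaling operation, here is a small sketch that compares two client-side placement schemes when a fourth node is added. Both schemes, the node names, and the key names are assumptions made for illustration; the lecture does not say which scheme a given Memcached client uses. Consistent hashing is shown as one common way to keep the remapped share small.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    # Stable hash, so results do not change between runs.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_node(key: str, nodes: List[str]) -> str:
    """Naive placement: hash the key, then take it modulo the node count."""
    return nodes[_hash(key) % len(nodes)]


class HashRing:
    """Consistent hashing: each node owns many points on a ring."""

    def __init__(self, nodes: List[str], points_per_node: int = 100) -> None:
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring point after the key's hash, wrapping around at the end.
        index = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]


def moved_fraction(before: Dict[str, str], after: Dict[str, str]) -> float:
    moved = sum(1 for key in before if before[key] != after[key])
    return moved / len(before)


keys = [f"user:{i}" for i in range(1000)]
old_nodes = ["node-1", "node-2", "node-3"]
new_nodes = old_nodes + ["node-4"]

modulo_moved = moved_fraction(
    {k: modulo_node(k, old_nodes) for k in keys},
    {k: modulo_node(k, new_nodes) for k in keys},
)
old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
ring_before = {k: old_ring.node_for(k) for k in keys}
ring_after = {k: new_ring.node_for(k) for k in keys}
ring_moved = moved_fraction(ring_before, ring_after)

print(f"modulo placement remapped {modulo_moved:.0%} of keys")
print(f"consistent hashing remapped {ring_moved:.0%} of keys")
```

With modulo placement, most keys land on a different node after the change, so most of the cache is effectively cold. With the ring, the only keys that move are the ones taken over by the new node.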

15. Creating a Memcached cluster – Hands on

In this demo, let’s create a Memcached cluster. I’m here in the ElastiCache dashboard, so click on Memcached and create a new cluster, with the Memcached option for the cluster engine. Then provide a name, let’s say memcache cluster. And the port is 11211. We can go with the default parameter group. For the node type, we can choose the smallest available one, as this is just a demo. Under number of nodes, you can choose up to 20 nodes. Let’s go with three nodes for now. And under advanced Memcached settings, we can provide the networking settings. So we can choose a subnet group. So you can provide an existing one or create a new one.

So I’m going to stick to the Redis subnet group that we created earlier. And you can also choose a security group here. I’m using the default one. Just make sure that the security group that you choose here provides inbound access on the port 11211. All right. Then you can provide maintenance window settings here. You can subscribe to SNS notifications by providing an SNS topic here. And with that, we can create our cluster. And now it’s going to take a few minutes for the cluster to become available. So I’m going to pause the video here and come back once it’s available. All right, now we can see that our Memcached cluster is available.

Let’s expand it. And we can see that we have three nodes. We have our endpoint here. Let’s click on the cluster name. And here we can see that we have three nodes. So data will be distributed across these three nodes. All right. And if you want to add additional nodes, you can add them from here. If you want to remove a node, you can remove a node from here. Okay. And you can see the monitoring statistics or the CloudWatch metrics here. Now, we just created this cluster and it is not in use. So of course there are no metrics here. But you can definitely monitor your clusters from here.

All right, so this will show you the data of all the different nodes. And similarly, if we go back to our Redis cluster here, we can see the statistics or the metrics. It’s been a while since we created this Redis cluster, so some metrics have been generated here. All right, this is the way you monitor your ElastiCache clusters, be it Redis or Memcached. All right, so that’s about it. And since we are done with the demos, I’m going to delete these clusters. All right, select the cluster and choose the option to delete. We don’t need a backup, and similarly for the Memcached cluster as well, I’m going to delete it. All right, so that’s about it. Let’s continue to the next lecture.

16. Choosing between Redis and Memcached

Now, let’s talk about choosing between Redis and Memcached. So let’s compare Redis and Memcached side by side. Both of them provide sub-millisecond latency or microsecond latency. Redis supports complex data types like sorted sets, hashes, bitmaps, HyperLogLog, geospatial indexes, and so on. And we have seen some of the use cases of Redis that use sorted sets and hashes. And on the other hand, Memcached only supports simple data types like strings or objects. Redis supports Multi-AZ with auto failover, and it also supports sharding.

So sharding is used for distributing data across different shards or different nodes. And similarly, Memcached supports multiple nodes for sharding purposes. So you can have multiple nodes, up to 20 nodes, in Memcached to distribute your data. And with Redis, in addition to shards, you can also have read replicas within each shard. So read replicas are used for read scaling as well as for high availability. So Redis supports read replicas. Memcached does not support read replicas, but it does support sharding. Then Redis provides data durability using what is called AOF persistence.

AOF stands for append-only file, and it’s the change-log-style persistence format that’s specific to Redis. Then Redis also supports backup and restore features. And on the other hand, Memcached does not support persistence, it’s a non-persistent cache, and also it does not support any backup and restore features. But what it does support is a multi-threaded architecture. So Memcached offers a multi-threaded architecture, whereas Redis does not support multithreading. All right, so that’s about the comparison of the two. Let’s continue to the next lecture.

17. ElastiCache security

Let’s talk about ElastiCache security. First, the encryption. So Memcached does not support encryption, but Redis does support it. So Redis supports encryption at rest as well as encryption in transit. So encryption at rest for Redis is implemented using KMS, whereas encryption in transit for Redis uses SSL or TLS. So you can encrypt your data in transit between a server and a client. And remember that this is an optional feature, and it can have some performance impact if you use encryption in transit with your Redis cluster. And you can also encrypt connections between your replicas. So, replication also supports encryption in transit. And the Redis snapshots that are stored in S3 use S3’s encryption capabilities.

So when you create your snapshots and move them to S3, the process also uses encryption in transit. All right? So that’s about encryption. Now let’s talk about authentication and access control. So, authentication of your ElastiCache cluster has a couple of layers. So with Redis, you can use Redis AUTH for authentication so your server can authenticate the clients. And this requires SSL or TLS to be enabled. And the way you enable Redis AUTH is you simply modify your cluster or your replication group and add the authentication token, or AUTH token. So the AUTH token is simply a password that you can use to connect to your Redis cluster. So you modify your cluster or the replication group and provide a password.

And additionally, you can also provide the token update strategy so you can rotate your credentials on a periodic basis if you want. And to connect with Redis AUTH, you simply provide this password when you connect to your Redis cluster. Then you also have server authentication with Redis, so clients can authenticate that they are connecting to the right server. Further, let’s talk about IAM. So, IAM policies can be used for API-level security, like creating a cache, updating a cache, et cetera. And remember that ElastiCache does not support IAM permissions for actions within ElastiCache. For example, you cannot use IAM to control which clients can access what. All right, so that’s about IAM. Now let’s talk about the network security with ElastiCache.

So, it’s always recommended to use private subnets within your VPC to set up your ElastiCache clusters, and you control the network access using VPC security groups. All right? And for Redis, your security group should allow inbound access on the Redis port, which is 6379. And for Memcached, the port is 11211. So if you’re using Memcached, your security group should allow inbound connections on the port 11211. And there’s also something called ElastiCache security groups that you can use if your ElastiCache clusters are outside the Amazon VPC. ElastiCache security groups allow you to control access to your ElastiCache clusters that are running outside the VPC. And for clusters running within your VPC, you simply use the VPC security groups as usual. All right? So that’s about the ElastiCache security. Let’s continue to the next lecture.
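The port rules above can be illustrated with a tiny checker. This is only a model of the idea of an inbound rule (a port range plus an allowed source range); the `InboundRule` class, the `allows` function, and the addresses are hypothetical and are not the AWS API.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable

REDIS_PORT = 6379        # default Redis port, as stated in the lecture
MEMCACHED_PORT = 11211   # default Memcached port, as stated in the lecture


@dataclass(frozen=True)
class InboundRule:
    from_port: int
    to_port: int
    cidr: str   # source address range allowed by this rule


def allows(rules: Iterable[InboundRule], port: int, client_ip: str) -> bool:
    """Return True if any rule admits this client on this port."""
    client = ip_address(client_ip)
    return any(
        rule.from_port <= port <= rule.to_port and client in ip_network(rule.cidr)
        for rule in rules
    )


# A made-up group that only admits the application subnet on the Redis port.
rules = [InboundRule(REDIS_PORT, REDIS_PORT, "10.0.1.0/24")]
print(allows(rules, REDIS_PORT, "10.0.1.15"))
print(allows(rules, MEMCACHED_PORT, "10.0.1.15"))
```

A group like this would need a second rule for port 11211 before a Memcached client in the same subnet could connect.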

18. ElastiCache logging and monitoring

ElastiCache logging and monitoring. As usual, this is provided by the CloudWatch service. So you have a couple of host-level metrics like CPU, memory, network and so on. And you can also monitor the Redis metrics like the replication lag, engine CPU utilization, or the metrics that you get from the Redis INFO command. And as is the case with any other service, CloudWatch provides up to 60 seconds of granularity. Further, you also have what is called ElastiCache events, and you can very well guess that this must be integrated with SNS. So the logs of events related to your cluster instances, security groups, and parameter groups can be relayed on to SNS as events. And you can subscribe to that. And you can set up these events through the ElastiCache console itself. And just like any other AWS service, all the API calls are logged with CloudTrail. All right, so that’s about logging and monitoring in ElastiCache. Let’s continue to the next lecture.

19. ElastiCache pricing

Let’s talk about the ElastiCache pricing. So ElastiCache is priced on a per node hour basis, so you pay for the node hours consumed by your nodes. Partial node hours are billed as full node hours, and you can use reserved nodes for upfront discounts. So you pay upfront for one year or three years to get discounts on the standard node hour pricing. And of course, you pay for the data transfer. You don’t pay anything for the data transfer between your EC2 instance and ElastiCache within the same AZ, but all other data transfers are chargeable. You also pay for backup storage for automated as well as for manual snapshots on a per GB per month basis. And space for one snapshot is complimentary for each active Redis cluster. So that’s all about ElastiCache. And that’s the end of this section. Let’s continue to explore another database in the next section.
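The node hour rule can be worked through with a short calculation. The hourly rate below is a made-up number purely for the arithmetic, not an actual AWS price, and `node_hour_cost` is a hypothetical helper name.

```python
import math
from decimal import Decimal


def node_hour_cost(node_count: int, runtime_hours: float, hourly_rate: Decimal) -> Decimal:
    """Cost of running a cluster, rounding partial hours up to full hours."""
    billable_hours = math.ceil(runtime_hours)  # partial node hours bill as full
    return node_count * billable_hours * hourly_rate


# Three nodes running for 2 hours 15 minutes at a hypothetical 0.02 per node hour.
cost = node_hour_cost(node_count=3, runtime_hours=2.25, hourly_rate=Decimal("0.02"))
print(cost)
```

Here each node is billed for 3 hours rather than 2.25, so the cluster is charged for 9 node hours in total. Data transfer and backup storage charges are left out of this sketch.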
