In Oracle Database 23ai, the Oracle RAC two-stage rolling patch provides a framework in which patches that include data dictionary changes can be applied in a rolling fashion and enabled after the patch has been applied to the last instance, without requiring downtime. The feature splits the patching operation into two parts: binary changes at the software level and SQL changes at the database level.

During phase 1, the patch is applied to all instances, but the fix is not enabled. Consider a four-node RAC database with software version 23.3.1 installed across all four nodes. With the two-stage rolling patch strategy, the patch is applied one node at a time. While a node is being patched, the instances running on the remaining servers can still serve the database, and because the fix stays disabled until the patch has been applied on the last node, users keep access to the database throughout. So the patch is applied on the first node: its users are disconnected and reconnect to one of the three surviving servers, and once the patch is applied they can reconnect to the database instance running on node 1. In the same way, the patch is applied at the software level on the second, third, and fourth servers.

On completion of phase 1, the fix is enabled through a SQL statement: ALTER SYSTEM ENABLE RAC TWO_STAGE ROLLING UPDATE ALL. Before running this command, the activated binary still behaves as 23.3.1 even though the patch is applied everywhere; only after you run the statement is the fix enabled and the reported software version updated.

This feature helps reduce planned downtime. Reducing the need to take database instances down also improves performance, as workloads are not subject to rewarming the cache after an instance restart. The feature significantly reduces the number of nonrolling patches.
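As a minimal sketch of the phase-2 step (the statement is as quoted above; the ALL keyword enables all applied two-stage rolling patches, and any per-patch variant is not shown here):

    -- Phase 2: run once, after every instance runs the patched binaries.
    -- Until this statement is executed, the fix stays disabled cluster-wide.
    SQL> ALTER SYSTEM ENABLE RAC TWO_STAGE ROLLING UPDATE ALL;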
Local Rolling Database Maintenance in 23ai
Starting with Oracle Database 23ai, you can apply rolling patches locally for Oracle Real Application Clusters and Oracle RAC One Node deployments. It is very similar to single-server rolling database maintenance, but this feature targets multinode RAC environments. Let’s take a look at how it works.

Assume a two-node RAC database: we have host A and host B, with instance CDB1 running on host A and instance CDB2 running on host B. When we perform out-of-place patching, we install the software in a new home and then apply the patch there. Once the patch operation is complete, local rolling database maintenance lets us start a new instance out of the new home while the original instance is still running out of the original home. So at one point, host A runs two instances: one from the original home and the other from the new home. Once everything is ready, the services and sessions are moved to the new instance running out of the new home, and once the sessions are moved, the original instance is stopped. The same thing happens on host B: we install or patch the software in a new home, start a new instance out of that home (in this example, CDB2’s workload moves to the new instance CDB4), move the sessions to the new home, and stop the original instance.

Local rolling database maintenance provides uninterrupted database availability during maintenance activities, such as patching, for Oracle RAC and Oracle RAC One Node databases. This significantly improves the availability of your databases without causing extra workload on other cluster nodes.

Let’s walk through the steps. First, download the Oracle Database installation image file and extract it into a new Oracle home directory. From the new Oracle home directory, start OUI, apply the required release update, and perform the software installation; this installs the patched software in the new home. Next, run the SRVCTL modify database command with the -localrolling option. This enables local rolling and creates the new RAC instances: as soon as you run this command, the new instances are created but left stopped. For example, with a two-node RAC database, a new instance is created but stopped on the first node, and another is created but stopped on the second node. Then transfer the Oracle RAC or Oracle RAC One Node database, with its PDBs and services, from the old Oracle home to the new Oracle home. This is the step that starts the instances out of the new home, transfers the PDBs and services to the new instances, and then stops the original instances; it is done with SRVCTL transfer instance. Finally, verify the database configuration changes: the output of the srvctl config database command should show the new instance names for the database.
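A minimal command sketch of these steps, assuming a database named cdb; the -localrolling option and the transfer instance subcommand follow the names used above, and the exact flag syntax may differ:

    # Enable local rolling: the new instances are created but left stopped.
    $ srvctl modify database -db cdb -localrolling
    # Start the new instances from the new home, move PDBs, services, and
    # sessions over, then stop the original instances.
    $ srvctl transfer instance -db cdb
    # Verify: the output should now show the new instance names.
    $ srvctl config database -db cdb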
Smooth Reconfiguration of Oracle RAC Instances in 23ai
Servers leaving or joining a cluster result in a reconfiguration, which is essentially a synchronization event to recover all the changes made by the failed instance. Oracle RAC has reduced the time sessions wait on this event during reconfiguration. In Oracle RAC 23ai, smooth reconfiguration reduces the impact of a service disruption from both planned and unplanned operations, utilizing several features, such as Recovery Buddy, PDB and service isolation, and smooth reconfiguration itself, resulting in a faster reconfiguration than in previous releases. We will look at some of the features introduced in previous releases first, and then get into smooth reconfiguration, which was introduced in 23ai.

Let’s review global resource management. A user makes a connection to one of the RAC instances and can submit a SQL statement requesting a set of blocks. In order to cache database blocks, buffers must be allocated, and master metadata must also be allocated to describe changes to those buffers. An internal algorithm is used to decide which instance should contain the master metadata structure for each entity. In our example, the master metadata structures are distributed across instance 1 and instance 2. The information on the master metadata structure for an entity is persisted in the data dictionary and reused during instance startup. Global resources are also managed for unplanned instance crashes and planned service relocations.

Now, let’s review PDB and service isolation. Assume there are three PDBs: PDB1, PDB2, and PDB3. PDB1 is open on instance 1; PDB2 is open on instance 1, instance 2, and instance 3; and PDB3 is open on instance 2 and instance 3. When you make changes to PDB1, the master metadata owned by PDB1 is only available on instance 1. When you make changes to PDB3, the master metadata structure for PDB3 is distributed across the instances where PDB3 is up and running, in this example instance 2 and instance 3.

Now consider RAC reconfiguration. In a RAC environment, a PDB is reconfigured only when needed, for example if PDB1 is additionally opened on instance 2. Originally, PDB1 was available only on instance 1; when you open PDB1 on instance 2 as well, its master metadata is redistributed across instance 1 and instance 2. And if instance 2 goes down, what happens? All the PDBs that were running on instance 2 are no longer available there, so the master metadata that was kept on instance 2 must be redistributed across the surviving instances, in this example instance 1 and instance 3. The impact is isolated to the affected PDBs only: an unaffected PDB is not impacted when a CDB instance crashes, when PDB1 is opened on instance 2, or when a fourth instance is brought up. In all these cases, the impact is isolated at the PDB level. PDB and service isolation is a feature for CDBs and an enhancement of service-oriented buffer cache access; it improves performance by reducing distributed lock manager operations for services not offered on all instances of a PDB.
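As a quick illustrative check (standard GV$ views), you can see which instances each PDB is open on, which is exactly the set of instances its master metadata is distributed across:

    -- One row per PDB per instance; OPEN_MODE shows whether the PDB
    -- is actually open there.
    SQL> SELECT inst_id, name, open_mode FROM gv$pdbs ORDER BY name, inst_id;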
The next topic is Recovery Buddy for reconfiguration. The Recovery Buddy feature reduces the waiting time during reconfiguration. In prior releases, Oracle RAC instances identified and recovered the changes made by a failed instance by reading its redo logs. For example, suppose instance PROD1 goes down. In order to recover the blocks heavily modified on PROD1, one of the surviving instances must access the redo log file owned by PROD1 and identify the blocks to be recovered. That is physical I/O, and it is a time-consuming operation.

With the Recovery Buddy feature, we can reduce this I/O, thanks to an in-memory log and the recovery buddy concept. In a three-node RAC database, each instance is assigned a recovery buddy: for example, PROD1 is the recovery buddy of PROD3, PROD2 is the recovery buddy of PROD1, and PROD3 is the recovery buddy of PROD2. This means that when you make changes to blocks on PROD1, the changes are captured directly on PROD1, but the same changes are also maintained in the recovery buddy’s memory, the in-memory log. Likewise, when you make changes on PROD2, the changes are maintained not only locally but also on its recovery buddy instance. Here is an example: we connect to instance 1, request blocks, and make changes. These changes are maintained on PROD1, and the same changes are maintained on its recovery buddy instance. So if PROD1 goes down, instead of accessing the online redo log file owned by PROD1, the recovery buddy can directly read the in-memory log it has preserved to identify the blocks to be recovered, apply the changes, and complete recovery. This feature reduces the time required for reconfiguration.

Finally, smooth reconfiguration. Smooth reconfiguration of Oracle Real Application Clusters instances reduces brownout time during cluster reconfiguration. Here is an example. Suppose you run the srvctl command to stop an instance. In previous versions, as soon as you ran the srvctl stop instance command, the instance simply stopped, and the database was frozen until the metadata kept on the stopped instance was redistributed and the global resources were recovered. In 23ai, the algorithm is slightly different. You request an instance stop, but instead of stopping the instance immediately, the resource remastering operation is performed first: the metadata is distributed before the instance is stopped, and only after the redistribution is the instance actually shut down. Comparing 19c and 23ai, the order of operations changed slightly, and that reduces the time required for cluster reconfiguration. In 19c, as soon as you issued srvctl stop instance, the instance was killed and stopped, and then the global resources were remastered, so for a short period the database could not perform any activity. In 23ai, when a user requests srvctl stop instance, the resources are remastered first, and only then is the instance shut down, which reduces the reconfiguration time. In short, this feature redistributes the resource coordinator (the new name for what was called the resource master or resource owner; the terminology is equivalent) before shutting down instances for planned maintenance.
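A minimal illustration of the planned-maintenance path, assuming a database named prod with instance PROD1 (names are illustrative):

    # In 23ai, this remasters PROD1's resources first and then stops the
    # instance, so the surviving instances see little or no brownout.
    $ srvctl stop instance -db prod -instance PROD1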
Extent-based Scrubbing in 23ai
What is Oracle ASM scrubbing? Oracle ASM scrubbing is a feature that enhances availability and reliability by seeking out rarely accessed data and fixing logical corruptions automatically. You can perform on-demand scrubbing using the ALTER DISKGROUP ... SCRUB command. You can run it at the ASM disk group level by issuing ALTER DISKGROUP with the SCRUB option. It is also possible to scrub at the disk level: the second form, SCRUB DISK with a disk name, specifies the disk to be scanned for logical corruptions. And it is also possible to scrub at the file level, as in the third example, ALTER DISKGROUP data SCRUB FILE, which scrubs the specified file. Additional options can be added along with the SCRUB option, and if you want to stop an ongoing scrub operation, you can run ALTER DISKGROUP with the SCRUB STOP option. All of this was available prior to 23ai, so what is new in 23ai?

With Oracle Database 23ai, it is now possible to scrub individual extents or a range of extents. Previously, ASM scrubbing was only available at the file, disk, or disk group level. Compared to scrubbing an entire file, in 23ai you can scrub a specific extent set to reduce the scrubbing turnaround time, which improves data availability and minimizes the performance impact. Extent-based scrubbing uses the same ALTER DISKGROUP ... SCRUB command, but with additional options. Look at the first command: ALTER DISKGROUP ... SCRUB FILE with BLOCK 10 and COUNT 3. It identifies block number 10, counts three blocks from there (blocks 10, 11, and 12), identifies the extents that contain those data blocks, and scrubs them. So with extent-based scrubbing, instead of scrubbing an entire file, you specify specific blocks, and the command automatically identifies the extents that contain the blocks you specified, which dramatically reduces the turnaround time. It is also possible to scrub multiple groups of data blocks by repeating the BLOCK and COUNT combination. In the second example, we specify block number 50 with a count of 10 and block number 1024 with a count of 70, meaning: identify block number 50 and count 10 blocks from there, then identify block number 1024 and count 70 blocks from there. Once the blocks are identified, the extents that contain those data blocks are determined, and extent-based scrubbing is performed on those extents. This dramatically improves performance.
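Hedged examples of these forms, assuming a disk group named DATA; the disk and file names are illustrative, and the BLOCK/COUNT clause ordering follows the description above:

    -- Pre-23ai scopes: disk group, disk, and file, plus stopping a scrub.
    SQL> ALTER DISKGROUP data SCRUB POWER LOW;
    SQL> ALTER DISKGROUP data SCRUB DISK DATA_0001 REPAIR;
    SQL> ALTER DISKGROUP data SCRUB FILE '+DATA/ORCL/DATAFILE/example.266.806582193' REPAIR;
    SQL> ALTER DISKGROUP data SCRUB STOP;

    -- 23ai extent-based scrubbing: blocks 10, 11, and 12 of the file.
    SQL> ALTER DISKGROUP data SCRUB FILE '+DATA/ORCL/DATAFILE/example.266.806582193'
         BLOCK 10 COUNT 3;

    -- Multiple block ranges: blocks 50-59 and blocks 1024-1093.
    SQL> ALTER DISKGROUP data SCRUB FILE '+DATA/ORCL/DATAFILE/example.266.806582193'
         BLOCK 50 COUNT 10 BLOCK 1024 COUNT 70;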
Cluster Health Monitor Enhancements in 23ai
Let’s take a look at the Cluster Health Monitor enhancements in Oracle Database 23ai. The Autonomous Health Framework repository is a repository that stores the information collected by various components, such as Cluster Health Monitor, Cluster Health Advisor, Fleet Patching and Provisioning, and Oracle Clusterware. All of these components collect information and store it in this repository, which is called the Autonomous Health Framework repository. There is a change to this repository: starting with Oracle Database 23ai, the use of the Grid Infrastructure Management Repository, which we used to call GIMR, is desupported. Instead, Oracle Database 23ai uses a directory on the local file system. That is one change in this area.

Let’s review the related components, Cluster Health Monitor and Cluster Health Advisor, both components of Grid Infrastructure. Cluster Health Monitor persists the operating system metrics it collects in a directory under the Oracle base, which serves as the metric repository. This repository is auto-managed on the local file system, and you can change its location and size. Note that samples are continuously written to this repository, that is, to the local file system-based repository, not to GIMR. The data is saved in JSON format, historical data is auto-archived into hourly zip files, and archived files are automatically purged once the default retention limit of 200 MB is reached. So the one thing to remember in 23ai: GIMR is desupported; a local file system repository is used instead, and that repository is auto-managed.

Now the other component, Cluster Health Advisor. This component continuously monitors cluster nodes and RAC databases for performance and availability issues, to provide early warnings of problems before they become critical. You can think of it as something like ADDM applied at the cluster and database level: it collects database information along with the OS metrics, analyzes them, and gives you recommendations. Oracle Cluster Health Advisor supports the monitoring of two critical subsystems of Oracle Real Application Clusters: the Oracle Database, and the database host system. Notably, in 23ai, Cluster Health Advisor can monitor not only container databases but also pluggable databases, so it can leverage PDB-level data for better insight and better information. All analysis results, diagnostics, corrective actions, and metric evidence are stored in the file system-based repository.

There is another new feature in Oracle Database 23ai: Cluster Health Monitor introduces a new diagnostic capability that identifies critical component events indicating pending or actual failures, and provides recommendations for corrective actions. Components running in the cluster, for example the RDBMS, GIPC, and CSS, can generate events that describe failures, and once such an event is created, it can be sent to Cluster Health Monitor. Prior to Oracle Database 23ai, CHM was responsible for collecting information, especially OS metrics. In 23ai, in addition to OS metrics, CHM can also receive the events sent by various components such as the RDBMS, CSS, GIPC, and so on.
In addition, CHM can work with a new component called the CHM diagnostics component: CHM can ask the CHM diagnostics component to review events, make recommendations, and, when possible, take action as well. These are the enhancements in the diagnostics area. So if something goes bad in the cluster, a cluster component can create an event to describe the failure; CHM receives this event and works with the CHM diagnostics component to generate recommendations and, when possible, take action. All actions and recommendations are stored in the file system-based repository, and administrators are notified through components such as Oracle Trace File Analyzer. Improving the robustness and reliability of the Oracle Database hosting infrastructure is a critical business requirement for enterprises, and the improved ability to detect and correct problems at first failure, and to self-heal autonomously, delivers value by improving business continuity. That is a big improvement in 23ai.

Now, let’s take a look at the new oclumon commands related to the new diagnostics component. The first command is oclumon chmdiag describe, which returns a detailed description of all supported events and actions. We can also run oclumon chmdiag with the query option, which queries the CHM diagnostics events and actions sent by the various components and generates an HTML or text report. It is also possible to run oclumon chmdiag collect with additional options, such as the last 1.5 hours and an output directory specifying where to create the output; this collects all event and action data generated by CHM diagnostics into the specified output directory location.
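A hedged sketch of these commands; the subcommand names follow the descriptions above, and the duration and output-directory flag formats are assumptions:

    # Detailed description of all supported events and actions.
    $ oclumon chmdiag describe
    # Query events/actions sent by the components; produces an HTML/text report.
    $ oclumon chmdiag query
    # Collect the last 1.5 hours of chmdiag event/action data into a directory
    # (flag spellings are assumptions; /tmp/chmdiag is illustrative).
    $ oclumon chmdiag collect -last "01:30" -outdir /tmp/chmdiag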