December 30, 2016

fdmContext codes

When creating scripts for FDMEE you use the fdmContext dictionary to find information about the currently executing rule.  For example, to find the period name you would use something like:

perName = fdmContext["PERIODNAME"]

The ability to query the context of the current rule provides a lot of power and flexibility in your Jython scripts.  But the keys for the fdmContext dictionary are not well documented.  If you go binggoogling you will only find a handful of the possible values scattered across various blog pages.

Francisco Amores has an excellent blog that focuses on FDMEE.  One of his posts shows sample code to extract all of the keys and values from the dictionary.  I used this in the BefImport script to dump the values to the debug log at the start of processing.  Be sure to set the log level to 5 so the debug information gets logged.  You can also use the Jython file IO functions to open, write, and close a file to export the list to any convenient location.

The goal for this post is to show all the possible context codes and describe their use and origin.  For brevity's sake I will refer to the workspace Navigate > Administer > Data Management tab as FDMEE in the following descriptions.

Directory keys


APPROOTDIR The root directory for the FDMEE application.  The inbox, outbox, data, and scripts directories are below this root.  This is defined in the System Settings section of FDMEE.
BATCHSCRIPTDIR The directory on the FDMEE server that is created at install and holds utility scripts including encryptpassword, importmapping, loaddata, loadmetadata, regeneratescenario, runbatch, and runreport.
EPMORACLEHOME The EPMSystem11R1 directory
EPMORACLEINSTANCEHOME The EPM instance inside the user_projects directory.  These locations are defined during the install and configuration of FDMEE.
INBOXDIR The default location for importing data files
OUTBOXDIR The default location for the .dat, .err, and .drl files created during data loads.  It also contains the logs directory, which holds the status logs that can be handy for troubleshooting errors.
SCRIPTSDIR Where the scripts are saved.  The last three locations are created when you click the Create Application Folders button under System Settings in FDMEE.

Source Keys
These keys reference the AIF_SOURCE_SYSTEMS table in the FDMEE database and most of the values are defined in Data Management > Setup > Source System.

SOURCEID The SOURCE_SYSTEM_ID field which is a sequential index value
SOURCENAME The name of the source system.
SOURCETYPE The type of the source system such as File, SAP, MySQL, etc.

Target keys
These keys reference the AIF_TARGET_APPLICATIONS table in the FDMEE database with most of the values defined in Data Management > Setup > Target Application.

APPID The ID of the target application.  This is the sequential integer APPLICATION_ID key in the AIF_TARGET_APPLICATIONS table
TARGETAPPDB The database for the target application if loading to Essbase
TARGETAPPNAME The name of the target application.
TARGETAPPTYPE The type of application such as Planning, Essbase, Financial Management, etc.


POV Keys
These are values defined in the Point of View on the Workflow tab.

CATKEY The CATKEY field in the TPOVCATEGORY table.  A sequential primary key for the table
CATNAME The CATNAME field names the category as defined in the Category Mapping section of FDMEE.
LOCKEY The PARTITIONKEY field in the TPOVPARTITION table.  This is a sequential integer that is the primary key for the table
LOCNAME The Name as defined in the Location section of FDMEE.  It is held in the PARTNAME field in the TPOVPARTITION table.
PERIODKEY The full date of the period in the POV.  This is the PERIODKEY field, which is the primary key in the TPOVPERIOD table.
PERIODNAME The Period Name as defined in the Period Mapping section of FDMEE.  This is the PERIODDESC field in the TPOVPERIOD table.
RULEID The RULE_ID in the AIF_BALANCE_RULES table.  It is a sequential integer that is the primary key for the table
RULENAME The name of the rule as defined in the Workflow > Data Load Rule section of FDMEE.  It is held in the RULE_NAME field in the AIF_BALANCE_RULES table.

Data Load keys
These values are generated when the load rule runs.  The values come from the AIF_BAL_RULE_LOADS table.

LOADID The LOADID field which is a sequential primary key in the AIF_BAL_RULE_LOADS table.  For loads that don't come from text files this is the p_period_id variable in ODI.
EXPORTFLAG The value is Y or N depending on whether or not we are exporting to the target.  The value is in the  EXPORT_TO_TARGET_FLAG field.
EXPORTMODE This is the type of export which can be Replace, Accumulate, Merge, or Replace By Security.  The value is in the EXPORT_MODE field
FILEDIR When loading from a file this is the directory beneath APPROOTDIR where it is found.  When not doing a file load this value is None and there is no FILENAME key.  The value is in the FILE_PATH field
FILENAME The name of the data file in FILEDIR and held in the FILE_NAME_STATIC field.  The full path to the data load file would be APPROOTDIR/FILEDIR/FILENAME
IMPORTFLAG The value is Y or N depending on whether or not we are importing from the source.  This might be N if we are working in the Data Load Workbench and did an import, corrected some validation issues, and then did the Export as a follow-up step.  The value is in the IMPORT_FROM_SOURCE_FLAG field.
IMPORTFORMAT This is the name of the Import Format configured in the Setup tab of FDMEE.  The value is in the IMPGROUPKEY field.
IMPORTMODE I have no idea what this field is or what it does.  In all my testing this field had the value of None.  Maybe this is a placeholder for something planned in the future?
MULTIPERIODLOAD This is Y if you are loading more than one period.  You would do this from the dialog displayed when you click the Execute button on the Data Load Rule tab.  There isn't a field for this in the AIF_BAL_RULE_LOADS table so I suspect it is set to Y if START_PERIODKEY is not equal to END_PERIODKEY.
USERLOCALE The locale for the current user.  This is used to translate prompts into other languages.  I can't find this in any of the tables so I suspect it is pulled directly from workspace.
USERNAME The username of the person running the data load.  This isn't held in any of the AIF_BAL_RULE_xxx tables so this may be pulled from workspace.

This is most of the information but I am missing a few things such as the source of the USERNAME key.  If you have information to add please reply in the comments below to assist your fellow travelers.

October 24, 2016

Truncating the Job Console logs

In the last post we talked about how to automate deleting the task flow logs.  Another place where log entries accumulate is the Library Job Console.

The Job Console log is more manageable than the Task Flow log.  In the Job Console tab you can display up to 200 entries per page and you can click the first record, scroll to the bottom of the list, shift+click to select all the entries, then right-click and delete.


This is better but can still be a bunch of work if you haven't cleared the logs in a while.  And wouldn't you rather have an automated solution that can be scheduled so you can work on other things?  Of course you would.

Analysis

The discovery was pretty easy.  In the EPMA database the table name [JM_Job] is a dead giveaway.  The contents show the same information we see in the Job Console tab.


There are three tables with the JM* prefix which hold the job log data: [JM_Job], [JM_Batch], and [JM_Attachment].  They are connected by the i_job_id key, which has sequential values and is the primary key in the [JM_Job] table.  The [JM_Attachment] primary key combines i_job_id and i_attachment_id, where i_attachment_id is also a sequential value.  The primary key for the [JM_Batch] table combines the foreign keys i_job_id and i_attachment_id with the field i_batch_id.


The [JM_Job] table has the start and completion times for the various jobs.  So we can reference that to find our cut off date, get the i_job_id for the previous job, and delete from the three tables anything that is less than the cut off i_job_id.

Solution

The wrinkle is that because of the foreign key constraints on i_job_id, the [JM_Job] table has to be the last one truncated.  And because i_attachment_id is a foreign key in [JM_Batch], that table has to be the first one truncated.

Like with the task flow truncation we can either calculate or specify the cut off date.

DECLARE @CutOff datetime;
SET     @CutOff = DATEADD(M,-4,GetDate());
--DECLARE @CutOff char(10);
--SET     @CutOff = '2016-01-01'

There are several ways to tackle getting the i_job_id for our cut off date.  One approach is to assign the i_job_id to a variable and use that in the DELETE statement:

DECLARE @ID Int;
SET @ID = (SELECT Top 1 i_job_id FROM JM_Job WHERE d_started < @CutOff ORDER by i_job_id DESC)
DELETE FROM JM_Batch WHERE i_job_id <= @ID

But these are fairly small tables so it is easy to use a subselect in the DELETE:

DELETE from JM_Batch
where i_job_id in (
SELECT i_job_id
FROM JM_Job
WHERE d_started < @CutOff)

The last piece is to arrange the DELETEs in the correct order.  The complete TSQL script is:

use EPMA
go

-- Anything prior to cutoff date will be deleted
--   The DateAdd function can change depending on need
--
DECLARE @CutOff datetime;
SET     @CutOff = DATEADD(M,-4,GetDate());
--DECLARE @CutOff char(10);
--SET     @CutOff = '2015-01-01'
PRINT 'Cut off date = '+convert(VarChar(32),@CutOff)

-- Because of the key constraints the deletes need to happen in this order
--   JM_Batch has to be first because it has keys from JM_Attachment and JM_Job
--   JM_Attachment has key for JM_Job so deletes have to happen before
--     JM_Job but after JM_Batch
--   JM_Job has to be last because that has the date field we query against
--
-- For the first 2 deletes we get a list of i_job_id from JM_Job that are older
--   than the cutoff date, and compare the i_job_id in the table to that list
--
DELETE from JM_Batch
where i_job_id in (
SELECT i_job_id
FROM JM_Job
WHERE d_started < @CutOff)

DELETE from JM_Attachment
where i_job_id in (
SELECT i_job_id
FROM JM_Job
WHERE d_started < @CutOff)

DELETE from JM_Job
where d_started < @CutOff

This gives us a tidy bit of code that we can run on a regular basis to truncate the Library Job Console log to purge stale information.

This blog post is a bit short, so here's a picture of a kitty.


August 30, 2016

Automate Deleting Taskflow Log

We use HFM taskflows.  A lot.  When a taskflow runs it generates log information so you can check on the status of the execution and examine any errors to effect remediation.

But like the EPMA logs, these log entries stay around forever until you delete them.  On the Task Flow Status page you see each task flow status on a line with a check box.  To delete a log entry you click the check box to select the entry and click the Delete button.  Child's play.


Since these are check boxes you would think, as a standard GUI practice, you could check multiple boxes and click the delete button to delete multiple log records.  You would be wrong.  You have to select the log records one at a time to delete them.  Isn't that convenient?

So if you don't constantly do maintenance you wind up with a bunch of useless information in your log viewer.  Searches take longer and if you want to delete entries it takes a lot of time to complete.  What's a mother to do?

Analysis

We use the taskflows to monkey with our HFM applications.  Taskflows are found in the Consolidation Administration tab so you might think the information for taskflows is in the HFM database.  But we can also do EPMA related things like run data synchronizations and redeploy applications.  So where do we find the tables that manage the taskflows and logs?  In the Shared Services database, of course.

It took me some digging to figure that out, but the first clue was finding a table named [ces_wf_instances].  SELECTing from that table shows the field process_name with the names of my taskflows.  Other useful fields include starttime, a datetime field that we will use to find all the old log entries, and workflow_id and taskid which are used as references to other tables.  workflow_id is the taskid with the string "wf-" as a prefix.  workflow_id is also part of the primary key so it is unique across the tables.



The nice thing is that taskid values are always in ascending order.  I don't have any way to prove this but I suspect that taskid is based at least in part on the datetime of the activity.  This means that when we process the other tables we can define a cutoff date, find the first taskid in [ces_wf_instances] for that date, then delete anything that has a taskid less than that.  The not so nice thing is that in all the other tables the taskid field is named task_id.



What other tables will we process?  There are 13 tables with the ces_ prefix but some of them are empty and others are unrelated to logging.  The [ces_tasks] table lists the tasks within the taskflow, and [ces_messages] contains more detailed messages about each task.  Both tables contain a field named objectvalue which is an image blob.  This is a large hex string that gets processed to show that step's part of the workflow status.  Both of these tables are referenced by task_id which, like taskid in [ces_wf_instances], is sequential, so we can still use the strategy of deleting records less than the target task_id.



There are also the [ces_participant] and [ces_participant_events] tables which have details on the steps in the taskflow.  Like the [ces_tasks] and [ces_messages] tables, these tables have the objectvalue image blob.  The [ces_participant] table is referenced by workflow_id, which matches the field in the [ces_wf_instances] table.  There is also a field named participant_id which contains the workflow_id, the user, and the name of the step in the taskflow.  This participant_id is a foreign key in the [ces_participant_events] table.  Because of the foreign key constraint between the tables, when we delete records from the [ces_participant] table the records with the matching participant_id in the [ces_participant_events] table also get deleted.

Solution

So now that we know all the players, let's draw up a play to get us to the end zone.  We will build a SQL script that defines our cutoff date, finds the first taskid on or after that date, then deletes records from the other tables where the task_id is less than that value.

There are different tactics to define the cut off date.  If we are running this process manually maybe we just want to define the date with a static string:
DECLARE @PriorDate DateTime;
SET     @PriorDate = '2015-01-01';
But if we want to create a scheduled process we can calculate the desired cut off date.  For example, the following code will use the DATEADD() function to set the cut off date as 3 months prior to the current date:
DECLARE @PriorDate DateTime;
SET     @PriorDate = DATEADD(M,-3,GetDate());

Now that we have the date, we need to find the target taskid.  Since taskid values are sequential, we take the TOP 1 taskid, ordered by starttime, where the starttime is greater than or equal to the cut off date.  Note that we use greater than or equal to (>=) rather than just equal to (=) in case there was no taskflow execution on the cut off date.  This gives us a taskid where anything less is prior to our cut off date.  Also note that we make the @LastID variable nvarchar(100), which matches the data type used in the [ces_wf_instances] table.
DECLARE @LastID nvarchar(100);
SET     @LastID = (
  SELECT TOP 1 taskid
  FROM [ces_wf_instances]
  WHERE starttime >= @PriorDate
  ORDER BY starttime
)

For the [ces_messages] and [ces_tasks] tables we can do a simple delete where task_id is less than our target taskid.  But for the [ces_participant] table we need to get the workflow_id for all of the records prior to our cut off date.  One way to do this is to use the IN clause and build a list of workflow_id from the [ces_wf_instances] table where the starttime is less than the cut off date.
DELETE FROM [ces_participant]
WHERE workflow_id IN (
select workflow_id
from [ces_wf_instances]
where starttime < @PriorDate)

Remember that because of the foreign key constraint deleting records from [ces_participant] will also delete the associated records from the [ces_participant_events] table.  Also note that because we need the [ces_wf_instances] table to do this delete we have to make that the last table from which we delete records.

The entire SQL script is:

-- Cut off date
DECLARE @PriorDate DateTime;
SET     @PriorDate = DATEADD(M,-3,GetDate());
--SET     @PriorDate = '2015-01-01'
PRINT 'Cut off date = '+convert(VarChar(32),@PriorDate)

-- Get the first taskid on or after the cutoff date
DECLARE @LastID nvarchar(100);
SET     @LastID = (
  select top 1 taskid
  from [ces_wf_instances]
  where starttime >= @PriorDate
  order by starttime
  )
PRINT 'Last TaskID = '+@LastID

-- These tables are simple deletes
DELETE from [ces_messages] where task_id < @LastID
DELETE from [ces_tasks]    where task_id < @LastID

-- [ces_participant] table has a foreign key constraint against the [ces_participant_events]
-- table.  Deleting records from [ces_participant] deletes records with the same
-- participant_id from [ces_participant_events].  So to process these two tables we
-- get a list of workflow_ids where starttime is less than @PriorDate
DELETE FROM [ces_participant]
WHERE workflow_id IN (
select workflow_id
from [ces_wf_instances]
where starttime < @PriorDate
   )

-- We could also have deleted based on taskid
DELETE FROM [ces_wf_instances]
WHERE starttime < @PriorDate

This gives us a fairly simple SQL script we can schedule as a SQL agent job or any other means appropriate for the environment.  No manual steps, no muss, no fuss, and we keep the taskflow log and the underlying tables lean and manageable.
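
If you don't want to go the SQL Agent route, one alternative is to save the script to a .sql file and run it on a schedule from PowerShell.  This is just a sketch rather than a definitive setup; the server name, database name, and script path are placeholders for your environment, and it assumes the SqlServer (or older SQLPS) module is installed so that Invoke-Sqlcmd is available.

# Run the taskflow log truncation script against the Shared Services database
#   Server, database, and script path below are placeholders - adjust for your environment
#
Import-Module SqlServer
Invoke-Sqlcmd -ServerInstance "MySqlServer" `
              -Database "HSS" `
              -InputFile "D:\Scripts\TruncateTaskflowLog.sql" `
              -QueryTimeout 600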

August 3, 2016

The Case of the Missing Applications

Patching Oracle EPM products can be challenging.  Things have gotten better with newer versions but there are still wrinkles that have to be ironed out on occasion.

https://ww.deluxe.com/blog/using-facebook-insights-small-business/

In this case we were applying the latest Patch Set Updates (PSU) for our Financial Reporting/Reporting and Analysis/BIPlus applications.  One of the wrinkles with PSUs, particularly those for BIPlus, is that you have to redeploy the application with the EPM System Configurator/Config Tool.  One of the other wrinkles with EPM products is that they are frequently bolted on to the rest of the suite and will go by many names when you try to research a problem.

After successfully running OPatch to update the product I ran configtool.bat.  But the list of available products to configure only showed the basic Foundation Services.  None of the other products, including the Reporting and Analysis Framework we had just patched, were available.

Fortunately this was in our QA environment so there was no chance of causing a production outage.  Also fortunately, and oddly, the services were still visible in the Windows Services console.  And the OPatch for the PSU completed correctly which confirmed that the required file structure was still in place.

Perhaps the other products needed to be reinstalled?  But when I ran the InstallTool.cmd the only products available to re-install were the same Hyperion Foundation products that we saw in ConfigTool.

Research

I opened an SR with Oracle to help me determine the cause.  We ran the usual diagnostics and checked files related to inventory such as the HYPERION_HOME/inventory/contentsXML/comps.xml and EPM_INSTANCE/config/foundation/11.1.2.0/reg.properties files.  But that didn't show us anything useful.

The Oracle SE hit on the idea of using ProcMon to watch what files were being processed.  This pointed us to the HYPERION_HOME/.oracle.products file as the file referenced during configuration.  After comparing the .oracle.products file to our production server we confirmed that the file in our QA environment got damaged somewhere along the line.

Solution

The .oracle.products file isn't used for normal operations.  It is just referenced and updated during installation and configuration.  So the file could be damaged for months or years and you would never notice until you run the configurator or try to install or uninstall products.

One solution was to back up the existing .oracle.products file, remove all of the <InstallableItem> and <products> nodes from the XML file, then run InstallTool and reinstall the required products.  Because this would overwrite all of the existing .ear and .jar files, all previously applied patches would also need to be re-applied.

While this is a reasonable solution it involves several manual steps which can be prone to error.  And it can be time consuming to reapply all of the patches.

The other solution was simple and brilliantly elegant.  Just take a clean server and run InstallTool to install the products that are on the damaged server, then copy the .oracle.products file from the temporary server to the damaged server.

I used VirtualBox to quickly spin up a virtual machine to handle this task.  Note that all I needed to do was install the products to generate the .oracle.products file.  There was no need to worry about connecting to an existing WebLogic instance, or accessing databases, or any of the other configuration tasks.  Plus I already had a server image I could deploy in VirtualBox so I completed the whole process in an hour.

Once the .oracle.products file was copied I was able to successfully complete the redeploy and confirm that the PSU fixed the issue we were facing.

Takeaways:
1. ProcMon is a useful tool for finding missing or damaged files and registry keys
2. VMs are useful for troubleshooting and problem resolution.  Everybody gives away hypervisors so be sure to keep something handy for when you need it.
3. The .oracle.products file is used by InstallTool and ConfigTool to determine what is installed and configurable on the server.

May 23, 2016

Auditing Financial Reports

You know how it goes.  Somebody starts a project and then forgets about it.  Or there are multiple versions of a report that were made for various purposes.  Or a report was critical at one point but eventually interest waned and is now unused.  Whatever the cause you wind up with a bunch of reports, books, and batches that are never used.  But you can't tell which ones aren't used if you don't have an audit trail.

There are other uses for an audit trail beyond finding which reports are used.  Maybe you need to know who runs the reports, or how frequently reports are generated, or whether reports run at certain times so you can plan maintenance.  Or maybe you're just curious.


Oracle docs 1353965.1 and 1947774.1 show that the FRLogging.log file holds the audit data for Financial Reporting objects.  The FRLogging.log contains a bunch of information including start and stop times for reports and books, steps taken to produce output, queries used to pull from the data source, and other details.

The Problem

As noted in the 1947774.1 document you configure logging in the MIDDLEWARE_HOME/user_projects/domains/EPM_System/config/fmwconfig/servers/FinancialReporting0/logging.xml file.  By default BI+ limits the log file size to 1 MB.  When the log reaches that limit the FRLogging.log file is renamed by appending a sequential number to the file name and a new log file is created.  By default only 50 MB of log files are retained.

This is good news because BI+ won't create an infinite number of logs which could fill your drive and crash your system.  But this is bad news because you can blow through 50 MB of logging in an afternoon of running multiple report batches during month-end close, thus losing historical data.

Also the logs contain a lot of information that isn't interesting from an audit standpoint.  We just want lines that show the name of the report as well as the time it ran and the user who ran it.

We have to parse the FRLogging logs to get just the summary information.  But we can't process the FRLogging.log file directly because it is held open by the Financial Reporting services and new records are constantly being appended.

The Solution

Windows has a .NET class called FileSystemWatcher.  This will monitor a file or folder and take action when file system events occur such as creating, deleting, or renaming a file.  We can use this feature to monitor the FRLogging.log file.  When it gets renamed we find the newest numbered file in the folder and process that.  This allows us to process a file that is not in use and prevents us from processing a log file more than once.

There are several lines in the FRLogging.log that are useful for audits.  The START: record gets added when the job starts and the JOB: record gives a completion summary of the job including the total time it took the job to complete.  In between these are one or more REPORT: records.  If you run just the report you get one REPORT: record but for a book or batch you will get multiple records, one for each report generated.


Each of those lines contains the string "start:" (note the colon in the string).  For the JOB: and REPORT: records this string identifies the start time of the event while for the START: records the string is the record identifier.  No other records contain this string so this is a sure way to parse the log for the audit data.

To put this all together we need a script that will run FileSystemWatcher, wait for the FRLogging.log file to get renamed, find the latest numbered file, do a case-insensitive search for lines with the string "start:", and drop those lines into a file we can retain for as long as we need.  This PowerShell script accomplishes those tasks:

# PoSh script to keep a record of all report usage
#   Use the FileSystemWatcher to monitor the FRLogging.log.  When the file gets renamed
#  we pull the job information from the most recent file and save it to an audit log
#
$Folder = `
"D:\Oracle\Middleware\user_projects\domains\EPMSystem\servers\FinancialReporting0\logs"
$File   = "FRLogging.log" # File to monitor with FileSystemWatcher
$Filter = "FRLogging*.log" # Filter to find the most recently renamed file
$Output = "D:\FRAudit.log" # Running audit trail

# Configure the FileSystemWatcher.  Create the watcher object, then define the
# folder and file to watch
#
$watcher        = New-Object System.IO.FileSystemWatcher
$watcher.Path   = $Folder
$watcher.Filter = $File

# The watcher does not raise events until EnableRaisingEvents is turned on
$watcher.EnableRaisingEvents = $true

# Define the action to take when the file we are watching gets renamed
#
$renamed = Register-ObjectEvent $watcher "Renamed" -Action {

  # Get the FullName of the most recent numbered FRLogging*.log file
  $LogFile = (Get-ChildItem $Folder $Filter | Sort LastWriteTime -desc)[1].FullName

  # Do a case insensitive search for all lines that contain the string "start:"
  $NewRecs = get-content $LogFile | where {$_ -match "start:"}

  # And append those lines to our running audit trail
  $NewRecs | out-file $Output -append
}

If you use this script you will need to change the value for the $Folder variable to match your environment.  The value of the $Output variable is arbitrary but must be the full path to a file for your running audit trail.  The tricky part of the script is the line where we find the most recent log file.  Get-ChildItem returns an array of files matching the filter in the folder.  We sort that array by LastWriteTime in reverse order and take the second element of the array (index [1]) to get the latest numbered file since the first element (array index [0]) is the FRLogging.log file.

But how do we make sure the script stays running to continuously monitor the FRLogging.log file?

An easy solution is to use Windows Task Scheduler to create a task that runs at system startup.  You can also start the task manually when you first create it or if you adjust it later.  Since there is nothing to terminate the script the task keeps running in the background.  We basically get a service that starts automatically at every reboot.
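
If you want to script the task creation as well, the ScheduledTasks cmdlets available on Windows Server 2012 and later can register it for you.  A minimal sketch, assuming the watcher script above is saved as D:\Scripts\FRAuditWatcher.ps1 (the path, task name, and account are placeholders for your environment):

# Register a startup task that launches the watcher script
#   The -NoExit switch keeps the PowerShell session, and the event subscription, alive
#
$Action  = New-ScheduledTaskAction -Execute "powershell.exe" `
           -Argument "-NoExit -ExecutionPolicy Bypass -File D:\Scripts\FRAuditWatcher.ps1"
$Trigger = New-ScheduledTaskTrigger -AtStartup
Register-ScheduledTask -TaskName "FR Audit Watcher" -Action $Action -Trigger $Trigger `
           -User "SYSTEM" -RunLevel Highest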


Our new audit log is just a raw dump of the useful lines from the FRLogging.log file.  You will still need to parse that data to isolate the information of interest to you.  (I have another PowerShell script to do that and will send it to you if you are interested.)  But the key point is that we can leverage the transient information Oracle gives us and generate a persistent audit trail.  And we can do it for free using built-in tools and with minimal overhead.

April 21, 2016

Permanent temporary files

We update the metadata for our HFM applications at least once per month.  Before updating the shared dimension library and deploying the production application we test all of the changes in our development environment.

During a recent test the deployment in development failed.  Usually a failure is during the deployment phase due to some misconfiguration in the new dimension members.  In this case the deployment failed during the initial export with the error message:

Detail : An error was encountered while executing the Application Export:  An error was encountered during export: The file exists.

Oracle Doc ID 1639083.1 shows the same error for Planning applications.  Because Planning and HFM both use the EPMA dimension and application libraries you can often extrapolate from the Planning solutions to solve HFM problems.

The article says the <SystemDrive>:\Windows\temp folder is full and the solution is to delete files from that directory.  But my C:\ drive wasn't full and there weren't many files in the \windows\temp directory.

Troubleshooting

According to the Oracle doc we were running out of file names.  But where?

The deployment is performed by EPMA and it fails before HFM starts its processing.  (Years ago an Oracle support engineer told me that during a deployment all the processing up to 20% is EPMA and after that it is HFM.  At any rate we don't get past the export so I know definitively that the problem was with EPMA.)

SysInternals has a bunch of powerful utilities for Windows systems.  I've been using these utilities since the Windows NT days.  The author of the utilities, Mark Russinovich, was eventually hired as a Technical Fellow at Microsoft and is now the CTO of Azure.  He's one of the smartest guys you can follow and he's an engaging and articulate speaker if you ever see him give a presentation.

In this case I had to monitor the processes on the EPMA server to see what caused the failure.  From live.sysinternals.com I downloaded ProcMon and ran it on the EPMA server.  ProcMon will monitor everything on the server so you need to apply filters to isolate the information you want.  A handy technique is to let it capture for a few seconds, then stop the monitor, right-click any processes you don't care about such as explorer.exe, svchost.exe, your anti-virus software, etc and select Exclude.  This adds the exclusions to the current filter.  Repeat until you only have the handful of processes of interest, in this case anything related to EPMA.  Then clear the log, start the capture, and redeploy.


Once the deployment failed I stopped the capture and saved the results.  ProcMon captures access to the file system and registry as well as network and process thread events, so I removed everything except file system requests.  I didn't have to scroll down very far to see thousands of events from EPMA_Server.exe with a result of NAME COLLISION.  These are all in the AppData\Local\Temp directory of the user account under which the services run.


The files are named tmpXXXX.tmp where XXXX appears to be a random set of 4 hex digits.  Some of these files are created during application deployment.  We also do a nightly LCM export of EPMA for backup purposes, which creates more .tmp files.  Some of the .tmp files are XML properties of the applications and some are binary files.  Since this is our dev environment and we have many test applications, do regular deployments of HFM applications, and export EPMA nightly, we ran out of the 65,535 possible names for the .tmp files.
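
A quick way to see how close a server is getting to that limit is to count the tmp*.tmp files in the temp folder.  A rough check, assuming you run it as the account that runs the EPMA service (otherwise point the path at that account's AppData\Local\Temp folder):

# Count the tmpXXXX.tmp files in the current user's temp folder
#
(Get-ChildItem -Path $env:TEMP -Filter 'tmp*.tmp' -File).Count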

Solution

As the Oracle doc implies, we just delete the .tmp files.  They are, after all, just temporary files in a temp directory.  Why Oracle doesn't delete them automatically or overwrite existing files instead of throwing an error is a topic for philosophers and software engineers to resolve.

Another location where files accumulate is \Oracle\Middleware\user_projects\epmsystem\tmp.  That folder holds temp files for shared dimension library updates, the diagnostic framework, and WebLogic, among others.  Beneath this \tmp folder is a folder named EPMA that holds temp XML files with all properties of all dimensions in redeployed applications as well as temp ADS files created during LCM exports from EPMA.  Also beneath the \tmp folder is a folder oracle\tmp which has subfolders holding log files generated during HFM data syncs.

For us the issue came to light because we not only do nightly LCM exports of EPMA, we also have reporting applications that are updated nightly.  We use HFM taskflows to update the metadata, redeploy the applications, and run data syncs to load data.  Because the .xml files generated during the redeploy can be large and we went through many rounds of testing, the data drive eventually filled to capacity, which killed all activity on our EPMA server.

Our solution is regular maintenance to delete temp files in both the AppData\Local\Temp and the epmsystem\tmp folders on servers running the Hyperion EPMA Server service.  This task is easy enough to complete manually that I have not created an automated process for it.
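
For anyone who does want to automate it, a minimal sketch of the cleanup is below.  The folder paths and the 30 day retention are assumptions to adjust for your environment, and since it deletes files it is worth a dry run with -WhatIf on the Remove-Item first.

# Delete temp files older than 30 days from the epmsystem\tmp folder and the
#   EPMA service account's temp folder.  Paths and retention are examples only
#
$Folders = "D:\Oracle\Middleware\user_projects\epmsystem\tmp",
           "C:\Users\svc_epma\AppData\Local\Temp"
$CutOff  = (Get-Date).AddDays(-30)

foreach ($F in $Folders) {
    Get-ChildItem -Path $F -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $CutOff } |
        Remove-Item -Force -ErrorAction SilentlyContinue
}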

But it is curious that Oracle doesn't automatically delete, or provide facilities for automatically deleting, these temporary files.  If you know of anything built into the products to address these temporary files please share that information in the comments below.

March 24, 2016

Automating HFM form updates

We have lots of different forms in our HFM applications.  Many of these are used for reporting and analysis on various aspects of the income statement, balance sheet, and cash flow statements.  We also have a variety of supplemental forms for tax, human resources, and other business purposes.

http://xkcd.com/1566/

For these forms the year on the point of view should be the current fiscal year.  Although this can be adjusted by the users it is more convenient if the default POV is the current year.

We have over 100 such forms.  After we complete year-end close and roll forward to the next fiscal year all of these forms need to be adjusted to update the default POV for the new year.  It is a tedious task to do this by hand.

Strategy

There is a graphical interface for designing and editing forms.  This gives a friendly way of representing the script that is used to define the dimensions, members, and format of the form.  From the editor there is a button on the Actions bar that lets you edit the script directly.



Notice the line that starts with BackgroundPOV=.  This defines the default POV for the form.  So while one option is to use the GUI tools to update the POV, it is often quicker to edit the script and save the update.


Because the form is defined in a script, the script is what gets saved when we do LCM exports.  The relative path is ..\resource\Forms\Web Forms\.  Beneath that are the folders and forms as defined in workspace.  Each form is saved in an .xml file, but the content of the .xml file is primarily the script that we see in workspace.


Our strategy is to export the forms, edit the .xml files to update the year in the BackgroundPOV, then use LCM to import the updated forms.

Solution

The key is generating a script to automate the change.  PowerShell makes it easy to traverse a folder and subfolders for documents.  In this case the filter is easy because the forms all reside beneath a single parent folder and are all defined in .xml files.

We open each file in turn and look for the BackgroundPOV line.  The POV is defined using the standard HFM scripting style of <dimension>#<value> with periods separating the dimension definitions.  So we search for the string .Y# then use a simple substring to find the year value and replace it with the new value.

The sample script is:

# PoSh script to update POV in the web forms in HFM application with the new year
#    We use the LCM export of the forms for the application.  The exports include the
#    report script in the .xml files.  We search for the BackgroundPOV line and replace
#    the Y# value with the new year.
#
#    I'm assuming that the current year is correct and so we increment the year.
#    We could use a similar technique and replace the year with a static value
#

# All of the folders that have the web data forms that have to be updated
#   Replace the placeholders with the paths to your exported form folders
[array]$Dirs = '<Folder 1>',
               '<Folder 2>',
               '<Last folder>'

foreach ($D in $Dirs) {

    # Get all of the .xml files in the current web form folder
    #
    $Files = Get-ChildItem -path $D -filter '*.xml'
 
    foreach ($FN in $Files) {
        $FN

        # Read and loop through the .xml file
        #
        $Script = Get-Content $FN.FullName
        for ($i=0; $i -lt $Script.Length; $i++) {

            # Found the line we need to change
            #
            if ($Script[$i] -match 'BackgroundPOV') {
                $L = $Script[$i]

                # Get the 4 digit year.  POV is standard nomenclature so search for .Y#
                #
                $Year = $L.Substring($L.IndexOf('.Y#')+3,4)

                # Increment the year.  Note we convert to integer then back to string
                #
                [string]$New = [int]$Year + 1

                # Replace the year in the script line
                #
                $Script[$i] = $L.Replace($Year,$New)

                # There is only one line per file so exit the for loop
                #
                $i = $Script.length
            }
        }

        # Overwrite the old file with the new
        #
        $Script | out-file $FN.FullName
    }
}

Here are some things to note about the script.  I force the list of folders into an array even if I need just one folder; this gives the script flexibility to use any number of folders.  I hard code the paths for the forms but that could be replaced with a prompt or command line parameter.  I assume that the current year in the form is correct and increment the value, but you can replace that with a static string if you need to update forms with a specific value.  Also, I did not use the -Recurse option in the Get-ChildItem cmdlet, but this can be used to traverse all subfolders in the -Path to process all of the .xml files in all directories.  And it is good to have a backup, so before you run the script either do two LCM exports or work from a copy so you can return to a known good state if things go belly up.

I assume you know how to export and import using LCM so I don't describe that process.  My goal is detailing the strategy for doing the mass update and showing the sample script that you can customize for your needs.  The text files generated by LCM lend themselves to scripting techniques like this if you have similar issues with other products.  And while I'm personally a fan of PowerShell for scripted solutions, there are other scripting tools available if your comforts lie elsewhere.

March 15, 2016

SmartView and XML

Our user reported the following error while using SmartView:

XML Load Error: Invalid xml declaration
XML Load Error: Invalid xml declaration

Fortunately this happened after all data had been submitted for month-end close, so I had some time to research the problem.  Unfortunately it happened on Super Bowl Sunday, so there was still a time crunch because nachos wait for no man.  To make the situation weirder, the user had been working fine until Excel crashed, and after restarting Excel she couldn't log back in.

The error says it is a problem with XML so I started there.  SmartView keeps a few .xml files in the cfg folder.  Properties.xml is just a template for the properties.xml file that lives in the user profile at AppData\Roaming\Oracle\SmartView.  That file holds the default URL for connecting to the SmartView provider website and the list of previously used URLs.  We tried replacing the user's properties.xml file with the default .xml we distribute to force users to connect to the correct URL.  That didn't help.

SmartView also uses some .xml files on the server end in the folder Oracle\Middleware\EPMSystem11R1\products\FinancialManagement\Web\HFMOfficeProvider\TaskList_Config.  I compared those against our development environment and the files checked out.

I searched the Oracle knowledge base but there isn't much information there.  I found one article which indicated this might be a problem with the user provisioning.  I removed the user account from all of the groups, saved it, then added it back to the groups.  But she got the same error when logging in from SmartView.

We use Active Directory domain accounts for access to Hyperion and Essbase so users don't have to remember or sync multiple passwords for multiple accounts.  In desperation we provisioned a native directory account and gave it the same access as the problem Active Directory account.  This provided a temporary work around but what about a permanent solution?

Oracle support provided the answer.  There is a table in the HFM database named <appname>_USERPARAMS which holds parameters for each user, where <appname> is the name of the HFM application.  The parameters are stored as hex blobs and some of those blobs are in XML format.  If you see an entry starting with 0x3C it could be the start of an XML entry, since 3C is the hex value for the < character which is the first character of any XML string.  The speculation is that when Excel crashed it corrupted one of these entries.



This also explains why creating a native directory account worked around the issue.  The native account is recognized as a different account and had a different value in the Username field.

Running the command:

DELETE FROM <appname>_USERPARAMS WHERE Username LIKE '<youruser>%'

removed all of the entries for the problem user, replacing <appname> with the name of the HFM application and <youruser> with the username having problems.  The entries got reset to their defaults on the next logon and resolved the user's login issue from SmartView.

Like the EAL and IIS issue, there are other errors that can be resolved with the same technique.  Doc ID 1584266.1 shows that deleting the entries for a user will resolve the error:
Unknown Error in HFMProviderObject reference not set to an instance of an object.

Doc ID 1533030.1 shows that deleting all entries for the user or some specific parameters will resolve the error:
The parameter is incorrect. Error Reference Number: {319B0CB1-3B2D-4963-BB8F-025D2440A15D}

Doc ID 1371461.1 shows that deleting specific keys will resolve the error:
There are no rows to display for the current grid definition

Doc ID 1515682.1 shows that deleting specific keys will resolve the error:
An error has occurred. Please contact your administrator.
Error Number:13
Error Description:Type mismatch
Error Source:Microsoft VBScript runtime error
Page On which Error Occurred:/hfm/mbrsel/mbrsel.asp

All of these problems involve some sort of corruption of one or more of the parameters held in the USERPARAMS table; the solution is to remove the offending parameters and let HFM recreate them with the default values.  If someone was really ambitious they could translate the blob from hex to ASCII, determine what was missing, correct the string, translate it back into a hex blob, and update the BLOBData field for that ParameterKey.  But I'm not that ambitious and the quickest path is the supported solution of deleting the records and getting on with your life.
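
That said, if you just want to peek at what one of those blobs contains, a few lines of PowerShell will turn the hex into readable text.  This is only a rough sketch; it assumes you have copied the BLOBData value out of the table as a 0x-prefixed hex string and that the payload is plain ASCII such as an XML string.

# Convert a 0x-prefixed hex blob (copied from the BLOBData field) to readable text
#   The value below is a tiny sample only - paste in the real blob
#
$Hex   = '0x3C3F786D6C3E'
$Bytes = for ($i = 2; $i -lt $Hex.Length; $i += 2) {
             [Convert]::ToByte($Hex.Substring($i, 2), 16)
         }
[System.Text.Encoding]::ASCII.GetString($Bytes)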

January 26, 2016

EAL and IIS

Oracle Essbase Analytics Link for HFM (EAL) is a pretty cool product.  We use it to move data from HFM to Essbase over EAL bridges.  The EAL bridge defines the source and target applications as well as configuration parameters for the data transfer.  Data can be moved on demand but we can also create a transparent bridge which lets us have near real-time replication of our HFM data into Essbase.

https://www.pinterest.com/pin/100416266663858663/

This comes in handy for reporting purposes.  We use HFM to do the translations and consolidations of our financials and leverage Essbase for reporting.  During the close cycle we don't have to sacrifice resources needed for consolidations to reporting activity.  In addition we can leverage the Essbase calculation engine to add other accounts or dynamic scenarios that we use for reporting but don't need for normal close calculations.  So we let Essbase do the calculations for prior year or budget-to-actual comparisons so that HFM and Financial Reporting don't have to.

You can also do some management of the metadata during the transfer.  You can filter dimensions so you only bring over the members you need in Essbase.  You can rename dimensions which can be handy if you have legacy spreadsheets or reports that use different names.  You can also migrate security so that the security on the Essbase application matches the HFM application.

Product description

EAL has a few moving parts.  There is the EAL service (Hyperion Essbase Analytics Link Server - Web Application) that oversees and manages operations.  The Data Synchronization Service (Hyperion Essbase Analytics Link - DSS) is responsible for moving data from HFM to Essbase and does the bulk of the work.  (Usually these two services are installed on a server other than the HFM or Essbase servers so as not to impede other HFM and Essbase activity.)  The Financial Management Connector lives on one HFM server to facilitate communication with the HFM applications.  And there is an add-in for Essbase Administration Server (EAS) that is used to configure all of the pieces and create the bridges that move data from HFM to Essbase.

There are also some back-end pieces these services need.  An RDBMS database is needed to store the configuration information, which can be on either Oracle or MS SQL Server.  The Financial Management Connector uses IIS to host a website named livelink that provides access to the .dlls required to move the data.  If you are running HFM on Windows 2008 or later then you have to enable IIS 6 compatibility.

Problem and Solution

Our situation was a patch for EAL that was rolled back partway through the install.  When patching EAL the current version has to be completely uninstalled before the new version is installed.  Because the configuration data is held in the database that information is retained so the upgrades are usually painless.

After the rollback we tried to refresh the metadata for an existing bridge.  We use dynamic accounts, so I deleted the existing region for the HFM application and went to create a new region with the dynamic accounts enabled.  But when I tried to create the new region I saw this message in the EAS status window:

Data Synchronization Server database cannot be created

This points to an issue with the DSS.  After confirming that DSS was correctly configured and the database was valid we checked the dss.log file and saw:

[23 Nov 2015 16:21:39] [dbmgr] ERROR: HR#07722: Cube 'main_cube' of application 'MyApp' is not registered.
[23 Nov 2015 16:21:51] [dbmgr] ERROR: last message repeated 1 more time
[23 Nov 2015 16:21:51] [dbmgr] WARN : HR#01566: can't open directory "D:\oracle\product\EssbaseAnalyticsLink\oem\hfm\MyApp\Default": The system cannot find the path specified. [3]

But the path did exist.  We tried adjusting permissions but that was no help.  So we checked the eal.log which showed:

[2015-Nov-23 16:24:08] java.lang.OutOfMemoryError Exception [HR#09746]: in jni_GetStringUTFChars (src/jvm/api/jni/jni.c:963).
[2015-Nov-23 16:24:08] Attempting to allocate 256M bytes
[2015-Nov-23 16:24:08] There is insufficient native memory for the Java
[2015-Nov-23 16:24:08] Runtime Environment to continue.
[2015-Nov-23 16:24:08] Possible reasons:
[2015-Nov-23 16:24:08] The system is out of physical RAM or swap space
[2015-Nov-23 16:24:08] In 32 bit mode, the process size limit was hit
[2015-Nov-23 16:24:08] Possible solutions:
[2015-Nov-23 16:24:08] Reduce memory load on the system
[2015-Nov-23 16:24:08] Increase physical memory or swap space
[2015-Nov-23 16:24:08] Check if swap backing store is full
[2015-Nov-23 16:24:08] Use 64 bit Java on a 64 bit OS
[2015-Nov-23 16:24:08] Decrease Java heap size (-Xmx/-Xms)
[2015-Nov-23 16:24:08] Decrease number of Java threads
[2015-Nov-23 16:24:08] Decrease Java thread stack sizes (-Xss)
[2015-Nov-23 16:24:08] Disable compressed references (-XXcompressedRefs=false)

This looked like it was a memory issue so we tried changing some of the options in the EAL configuration tool.  There are also some java heap options in the registry at HKLM\SYSTEM\CurrentControlSet\services\Hyperion Essbase Analytics Link Server - Web Application\Parameters in the string value CmdLine.  None of this helped.

Finally we checked the livelink web site configuration and noticed that it was using the DefaultAppPool.  Ordinarily the default pool enables 32-bit applications.  But EAL requires 64-bit applications so the application pool has to have 32-bit applications disabled.  We already had an application pool specifically for EAL but between the partial upgrade and the rollback that assignment got reset.  Reconfiguring the livelink web site to use the 64-bit application pool resolved the problem.

There are other errors that can occur if you don't disable 32-bit applications for livelink.  Doc ID 1997467.1 reports that you can get this error in the eal.log:

An unknown error has occurred in the HsxAuthentication object.

And who doesn't love an unknown error?

Doc ID 1489138.1 reports that the eal.log will show this error:

Server/Cluster is incorrectly configured. Please reconfigure your Cluster or Server connection.
Server/Cluster is incorrectly configured. Please reconfigure your Cluster or Server connection.

And, yes, the error is duplicated in the log file.

These are different errors but they all have the same resolution which is to create an application pool for EAL, set Enable 32-Bit Applications to false, and assign that app pool to the livelink website.  This is an easy fix and an easy thing to check as you are troubleshooting EAL issues.