Transformation Hub
Software Version: 3.0.0
Administrator's Guide
Legal Notices
Micro Focus
The Lawn
22-30 Old Bath Road
Newbury, Berkshire RG14 1QN
UK
https://www.microfocus.com
Warranty
The only warranties for products and services of Micro Focus and its affiliates and licensors (“Micro Focus”) are set forth in the express
warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional
warranty. Micro Focus shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is
subject to change without notice.
Copyright Notice
© Copyright 2019 Micro Focus or one of its affiliates.
Confidential computer software. Valid license from Micro Focus required for possession, use or copying. The information contained
herein is subject to change without notice.
The only warranties for Micro Focus products and services are set forth in the express warranty statements accompanying such products
and services. Nothing herein should be construed as constituting an additional warranty. Micro Focus shall not be liable for technical or
editorial errors or omissions contained herein.
No portion of this product's documentation may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopying, recording, or information storage and retrieval systems, for any purpose other than the purchaser's internal use,
without the express written permission of Micro Focus.
Notwithstanding anything to the contrary in your license agreement for Micro Focus ArcSight software, you may reverse engineer and
modify certain open source components of the software in accordance with the license terms for those particular components. See below
for the applicable terms.
U.S. Governmental Rights. For purposes of your license to Micro Focus ArcSight software, “commercial computer software” is defined at
FAR 2.101. If acquired by or on behalf of a civilian agency, the U.S. Government acquires this commercial computer software and/or
commercial computer software documentation and other technical data subject to the terms of the Agreement as specified in 48 C.F.R.
12.212 (Computer Software) and 12.211 (Technical Data) of the Federal Acquisition Regulation (“FAR”) and its successors. If acquired
by or on behalf of any agency within the Department of Defense (“DOD”), the U.S. Government acquires this commercial computer
software and/or commercial computer software documentation subject to the terms of the Agreement as specified in 48 C.F.R. 227.7202-
3 of the DOD FAR Supplement (“DFARS”) and its successors. This U.S. Government Rights Section 18.11 is in lieu of, and supersedes, any
other FAR, DFARS, or other clause or provision that addresses government rights in computer software or technical data.
Trademark Notices
Adobe™ is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
Documentation Updates
The title page of this document contains the following identifying information:
• Software Version number
• Document Release Date, which changes each time the document is updated
• Software Release Date, which indicates the release date of this version of the software
To check for recent updates or to verify that you are using the most recent edition of a document, go to:
https://www.microfocus.com/support-and-services/documentation
Support
Contact Information
Phone: A list of phone numbers is available on the Technical Support page:
https://softwaresupport.softwaregrp.com/support-contact-information
Contents
Chapter 1: Transformation Hub 6
Management Center (ArcMC) 7
SmartConnectors 7
Logger 7
ArcSight Investigate 8
Chapter 2: Producing and Consuming Event Data 9
Producing Events with SmartConnectors 9
Consuming Events with ESM 10
Consuming Events with Logger 10
Sending Transformation Hub Data to Logger 11
Example Setup with Multiple Loggers in a Pool 12
Consuming Events with ArcSight Investigate and Micro Focus Vertica 12
Consuming Events with Third-Party Applications 13
Consuming Transformation Hub Events with Apache Hadoop 13
Architecture for Kafka to Hadoop Data Transfer 14
Using Apache Flume to Transfer Events to Hadoop 14
Setting Up Flume to Connect with Hadoop 15
Sample Flume Configuration File 16
Setting Up Hadoop 17
Connectors in Transformation Hub 18
Configuring Consumers and Producers for Availability 18
Chapter 3: Securing your Transformation Hub deployment 20
Changing Transformation Hub Security Mode 20
Chapter 4: Managing Transformation Hub 22
Licensing Transformation Hub 22
Managing Transformation Hub through ArcMC 23
Enabling Transformation Hub Management through ArcMC 24
About the Transformation Hub Manager 24
Connecting to the Transformation Hub Manager 24
Managing Clusters 25
Viewing Information About a Cluster 26
Managing Brokers 26
Viewing Broker Details 26
Summary 27
Metrics 27
Messages count 27
Per Topic Detail 27
Managing Topics 27
Creating Topics 28
Viewing Topic Details 29
Topic Summary 30
Metrics 30
Operations 30
Partitions by Broker 31
Consumers consuming from this topic 31
Partition Information 31
Managing Consumers 31
Viewing Consumer Details 32
Managing Preferred Replicas 32
Managing Partitions 33
Configuring Topic Partitions Based on Number of Consumers 33
Graceful Shutdown and Rebooting of Transformation Hub Nodes 34
Adding a New Worker Node 34
Backing Up and Restoring Master Nodes 35
Removing a Node 37
Removing a Crashed Node 37
Replacing a Crashed Master Node 38
Pushing JKS files from ArcMC 39
Liveness Probes 39
Chapter 5: Managing Transformation Hub Topics 43
Default Topics 43
Data Redundancy and Topic Replication 44
Managing Topics through ArcMC 44
Exporting and Importing Routes 44
Stream Processor Groups 45
Appendix A: Command reference 47
Glossary 49
SmartConnectors
SmartConnectors serve to collect, parse, normalize and categorize log data. Connectors are available for
forwarding events between and from Micro Focus applications like Transformation Hub and ESM, enabling
the creation of multi-tier monitoring and logging architectures for large organizations and for Managed
Service Providers.
The connector framework on which all SmartConnectors are built offers advanced features that ensure the
reliability, completeness, and security of log collection, as well as optimization of network usage. Those
features include: throttling, bandwidth management, caching, state persistence, filtering, encryption and
event enrichment. The granular normalization of log data allows for the deterministic correlation that
detects the latest threats including Advanced Persistent Threats and prepares data to be fed into machine
learning models.
SmartConnector technology supports over 400 different device types, leveraging ArcSight’s industry-
standard Common Event Format (CEF) for both Micro Focus and certified device vendors. This partner
ecosystem keeps growing not only with the number of supported devices but also with the level of native
adoption of CEF from device vendors.
Logger
Logger provides proven cost-effective and highly-scalable log data management and retention capabilities
for the SIEM, expandable to hundreds of nodes and supporting parallel searches. Notable features of
Logger include:
• Immutable storage
• High compression
• Archiving mechanism and management
• Transformation Hub integration
• Advanced reporting wizard
• Deployed as an appliance, software, or cloud infrastructure
• Regulatory compliance packages
ArcSight Investigate
ArcSight Investigate simplifies security investigations using advanced analytics to proactively hunt for and
defeat unknown threats to decrease the impact of security incidents. Powered by Micro Focus Vertica's
high-performance database, and including integration with Hadoop, threat analysis is simplified using
built-in Vertica-based analytics in a dashboard-driven and intuitive hunt interface.
(Acknowledgments do not indicate that consumers, such as Logger, have received the event data, only that
Transformation Hub itself has.)
Note: Performance impact due to leader acks is a known Kafka behavior. Exact details of the impact
will depend on your specific configuration, but could reduce the event rate by half or more.
For CEF topics, SmartConnector (version 7.7 or later) encodes its own IP address as meta-data in the Kafka
message for consumers that require that information, such as Logger Device Groups.
For more information about SmartConnectors and how to configure a Transformation Hub destination,
refer to the CEF Destinations chapter of the SmartConnector User's Guide, available for download from the
Micro Focus Software Community.
Note: It can take the broker nodes up to 24 hours to balance partitions among Kafka consumers.
Check the Transformation Hub Manager Consumers page to confirm that all consumers are consuming
from the topic.
The number of Loggers in a Logger pool is restricted by the number of event topic partitions. For example,
if there are only five partitions, only five Loggers will receive the events. If you have more than five Loggers
configured in the same Consumer Group, some Loggers will not normally receive events, but will be
available as hot spares. When adding receivers, be sure to increase the number of event topic partitions.
See Managing Topics for more information.
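If you prefer a command-line view, the standard Kafka tooling can also show which consumers in a Logger
pool own which partitions. The following is a minimal sketch that assumes the Kafka command-line tools are
available and uses placeholder broker and group names; the Transformation Hub Manager Consumers page
remains the primary way to verify this.

# Show partition assignments and lag for a consumer group (placeholder broker and group names)
kafka-consumer-groups.sh --bootstrap-server <worker node FQDN>:9092 --describe --group <logger consumer group>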
Prerequisites
• Transformation Hub installed: Consult the Micro Focus Transformation Hub Deployment Guide.
• Flume installed: For information on how to install and configure Flume, refer to the Flume documentation,
available at https://flume.apache.org/releases/content/1.6.0/FlumeUserGuide.pdf.
• Storage system installed: Refer to your storage system documentation.
Procedure
Flume is controlled by an agent configuration file. You must configure Transformation Hub as the source
agent, your storage system as the sink agent, and ZooKeeper as the channel agent in this file.
Edit the agent configuration file to include the required properties, as in the table below. Configure other
properties as needed for your environment.
Required Kafka Source Configuration
Property Description
topic The Event Topic from which this source reads messages. Flume supports only one topic per source.
The required configuration varies. Refer to the Flume documentation for details on your storage system.
The section Consuming Events with Apache Flume provides an example of how to configure Apache
Hadoop as the sink.
####################################################
tier1.sources = source1
tier1.channels = channel1
tier1.sinks = sink1
tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.source1.kafka.topics = th-cef
tier1.sources.source1.kafka.consumer.group.id = flume
tier1.sources.source1.channels = channel1
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = timestamp
tier1.sources.source1.kafka.consumer.timeout.ms = 150
tier1.sources.source1.kafka.consumer.batchsize = 100
tier1.channels.channel1.type = memory
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 1000
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.hdfs.path = hdfs://localhost:9000/opt/\
hadoop/cefEvents/year=%y/month=%m/day=%d
tier1.sinks.sink1.hdfs.rollInterval = 360
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.filePrefix = cefEvents
tier1.sinks.sink1.hdfs.fileSuffix = .cef
tier1.sinks.sink1.hdfs.batchSize = 100
tier1.sinks.sink1.hdfs.timeZone = UTC
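After saving the agent configuration (for example, as th-to-hdfs.conf, a placeholder file name), the Flume
agent can be started with the standard flume-ng command. This is a minimal sketch; adjust the configuration
directory and file name for your installation. The agent name tier1 matches the sample file above.

# Start the Flume agent defined in the sample configuration file
flume-ng agent --conf conf --conf-file th-to-hdfs.conf --name tier1 -Dflume.root.logger=INFO,console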
Setting Up Hadoop
This is an overview of the steps necessary to install Apache Hadoop 2.7.2 and set up a one-node cluster.
For more information, see https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-
common/SingleCluster.html, or refer to the Hadoop documentation for your version.
To install Hadoop:
1. Be sure that your environment meets the operating system and Java prerequisites for Hadoop.
2. Add a user named 'hadoop'.
3. Download and unpack Hadoop.
4. Configure Hadoop for pseudo-distributed operation.
• Set the environment variables.
• Set up passphraseless SSH.
• Optionally, set up Yarn. (You will not need Yarn if you want to use Hadoop only for storage and not for
processing.)
• Edit the Hadoop configuration files to set up a core location, a Hadoop Distributed File System
(HDFS) location, a replication value, a NameNode, and a DataNode.
• Format the NameNode.
5. Start the Hadoop server using the tools provided.
6. Access Hadoop Services in a browser and log in as the user "hadoop".
7. Execute the following commands to create the Hadoop cefEvents directory:
hadoop fs -mkdir /opt
hadoop fs -mkdir /opt/hadoop
hadoop fs -mkdir /opt/hadoop/cefEvents
8. Execute the following commands to grant permissions for Apache Flume to write to this HDFS directory:
hadoop fs -chmod -R 777 /opt/hadoop
hadoop fs -ls
9. Execute the following command to check Hadoop system status:
hadoop dfsadmin -report
10. Execute the following command to view the files transferred by Flume to Hadoop.
hadoop fs -ls -R /
For Producers
Configure the Initial Host:Port(s) parameter field in the Transformation Hub Destination to include all
Kafka cluster nodes as a comma-separated list.
Provide all Kafka cluster nodes for a producer and a consumer configuration to avoid a single point of
failure.
For more information on how Kafka handles this using bootstrap.servers, please see
https://kafka.apache.org/documentation/#producerconfigs.
For Consumers
Configure the Transformation Hub host(s) and port parameter field in the Receiver to include all Kafka
cluster nodes as a comma-separated list.
For more information on how Kafka handles this using bootstrap.servers, please see
https://kafka.apache.org/documentation/#consumerconfigs.
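In both cases the value is a plain comma-separated list of host:port pairs, one entry per Kafka cluster node.
For example (hypothetical host names; the port depends on the security mode configured for your cluster):

th-worker1.example.com:9092,th-worker2.example.com:9092,th-worker3.example.com:9092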
Note: TLS performance impact is a known Kafka behavior. Exact details of the impact will depend on
your specific configuration, but could reduce the event rate by half or more.
You can change the Transformation Hub security mode after deployment, but this will cause downtime for
your Transformation Hub and associated systems, such as consumers, producers, and ArcMC. You will
need to make sure all Transformation Hub-associated systems are re-configured as well.
Caution: If the security mode change requires that Transformation Hub consumer or Transformation
Hub producer restarts, the producer or consumer must be disconnected from Transformation Hub
first. Consult the appropriate consumer or producer documentation for details.
5. Click the ... (Browse) icon to the right of the main window.
6. From the drop-down, click Undeploy. The post-deployment settings page is displayed.
7. Undeploy the Transformation Hub.
8. Follow the consumer and producer documentation to reconfigure those applications to align their
security modes to be the same as Transformation Hub.
9. Redeploy the Transformation Hub as outlined in the Transformation Hub Deployment Guide.
10. Reconnect the consumers and producers to the Transformation Hub. See the respective product
documentation for these procedures.
Undeploying Transformation Hub will remove all previous configuration settings. You should back up your
configuration data before performing this process, and then restore it after redeployment.
Note: You can also check the status of the restarted broker node in the Transformation Hub Manager.
Only after the selected broker node is up and running should you proceed to restart the next node.
License Verification
The license check result is logged in the file <root install folder>/k8s-hostpath-volume/th/autopass/license.log.
If there is a valid license, the log will include the text:
The Micro Focus ArcSight Management Center Administrator's Guide also explains in detail how to view
the list of Transformation Hub consumers, manage topics and routing rules, monitor metrics, and enable
notifications.
1. From your local system, set up SSH forwarding and connect by using a command like the following:
ssh -L <local port>:<Transformation Hub Manager Service IP>:<port>
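As an illustration only (the local port 9999 and the user value are placeholders, as are the bracketed fields),
a complete port-forwarding command might look like the following:

# Forward local port 9999 to the Transformation Hub Manager service inside the cluster
ssh -L 9999:<Transformation Hub Manager Service IP>:<port> root@<master node FQDN or IP>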
Once you connect, the browser displays the Clusters page. For more information, see Managing Clusters.
Managing Clusters
The Clusters page is the Transformation Hub Manager's home page. From here you can modify, disable
or delete a cluster from view in the Transformation Hub Manager (the cluster itself is not deleted), or drill
down into the cluster for more information.
Location: Clusters
Click the Cluster Name link. The Transformation Hub Manager displays the Cluster Summary page. For
more information, see Viewing Information About a Cluster.
1. Click Modify. The Transformation Hub Manager displays the Update Cluster page.
2. Update the appropriate fields, and click Save.
Editing the cluster is an advanced operation; under normal circumstances, the cluster should not be edited.
Click Disable. Once a cluster has been disabled, a Delete button is displayed.
• If the cluster is not yet open, click Cluster > List in the navigation bar. Then click the Cluster Name link.
• If the cluster is already open, click Clusters > Cluster Name > Summary.
Click the Topics hyperlink (number of topics) to show the topics in the cluster. For more information, see
Managing Topics.
Click the Brokers hyperlink (number of broker nodes) to show the broker nodes in the cluster. For more
information, see Managing Brokers.
Managing Brokers
On the Brokers page, you can see an overview of all of your Worker nodes and drill down into a node for
more information.
Note: The term Brokers refers to single nodes running Kafka (that is, Worker Nodes, but not Master
Nodes).
Click the broker's Id link. The Broker Name ID page opens. For more information, see Viewing Broker Details.
Summary
In the Summary section, you can see an overview of your broker, including the number of topics and
partitions located on it.
Metrics
In the Metrics section, you can view information about the data flow.
Messages count
In the Messages section, you can view a message view chart.
Click the Topic Name link in the Per Topic Details section. See Viewing Topic Details.
Managing Topics
On the Topics page, you can run or generate partition assignments, add a new partition, and drill down
into individual topics for more information.
Location: Clusters > Cluster Name > Topic > List
Note: The following topics are created by default. They are used internally by Transformation Hub
and should not be deleted, modified, or used by external data producers or consumers.
• __consumer_offsets
• _schemas
• th-internal-datastore
• th-internal-stream-processor-metrics
• th-con-syslog
• th-internal-avro
Click the Topic Name link. The Topic Name page displays the topic's summary, metrics, consumers, and
partitions. See Viewing Topic Details.
To add a partition:
Creating Topics
You can create a new topic on the Create Topic page.
Fill in the fields and click Create. For a discussion of field values, consult the Kafka documentation at
https://kafka.apache.org/documentation/#topicconfigs.
The number of custom topics you can create is limited by Kafka, as well as by the performance and system
resources needed to support the number of topics created.
Note: Alternatively, you can also create topics on a managed Transformation Hub in ArcMC, or invoke
the kafka-topics command from the CLI on the Kafka pod using the kubectl exec command.
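As a sketch of the CLI approach mentioned in the note (the namespace, pod name, and ZooKeeper address
are placeholders, and the exact options vary by Kafka version), creating a topic from the Kafka pod might
look like the following:

# Create a topic with 6 partitions and a replication factor of 2 from inside the Kafka pod
# (newer Kafka releases use --bootstrap-server instead of --zookeeper)
kubectl exec -n <arcsight-installer namespace> <kafka pod name> -- kafka-topics --zookeeper <zookeeper host>:2181 --create --topic my-custom-topic --partitions 6 --replication-factor 2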
Topic Summary
In the Topic Summary section, you can view information on the topic's replicas, partitions, and broker nodes.
Metrics
In the Metrics section, you can view information about the data flow.
Operations
In the Operations section, you can reassign partitions, generate partition assignments, add partitions,
update the topic configuration, and manually assign topics to broker nodes.
To reassign partitions:
To add a partition:
Partitions by Broker
In the Partitions by Broker section, you can see topic partition information and drill down to see details
for each broker.
Click the Broker link. The Topic Summary page displays information on the topic's lag, partitions, and
consumer offset.
In Transformation Hub Manager, users will see different offset values between CEF (Investigate or Logger)
topics and binary (ESM) topics. In CEF topics, the offset value can generally be associated with the number of
events that passed through the topic. Each message in a CEF topic is an individual event. However, that
same association cannot be made in binary topics, since the data is in binary format.
New consumers can take some time to display properly. Give the process time to populate correct data.
Click the Topic Name link. The Topic Summary page displays information on the topic's lag, partitions, and
consumer offset.
Partition Information
In the Partition Information section, you can view information about the topic's partitions and drill down
for more information on each leader.
Click the Leader link. The Broker Name ID page displays the broker's summary, metrics, message count,
and topic details. See Viewing Broker Details.
Managing Consumers
On the Consumers page, you can see a list of consumers, view their type, the topics they consume, and drill
down into each consumer and topic for more information.
Click the Consumer Name link. The Consumer Name page displays details about the consumer. You can
drill down further for more information.
Click the Topic Name link. The Topic Name page displays details about the topic. You can drill down further
for more information.
1. Click the Topic Name. The Consumed Topic Information page displays information about the topic.
Click the topic name for more information.
Managing Partitions
You can reassign partitions for your cluster on the Reassign Partitions page.
Location: Clusters > Cluster Name > Reassign Partitions
Note: Please make sure that this procedure is always used for graceful reboots.
To gracefully reboot each cluster node (master nodes first, worker nodes last), and avoid disrupting the
flow of events, run the following commands:
ssh <node_ip/hostname>
kube-stop.sh
sync; sync
Run watch kubectl get pods --all-namespaces -o wide to monitor all pods and ensure they
are back in running state before moving on to the next node.
Note: A known issue exists where the kube-stop.sh script reports that services have been stopped
when, in fact, they have not been stopped yet. Monitor your processes to ensure that the Docker and
cluster processes have finished before rebooting. Rebooting before the services have finished will
prevent the server from rebooting fully and will also affect the cluster health.
1. Set up and provision the new node according to the guidelines and system requirements given in the
Micro Focus CDF Planning Guide. Note the IP address of the new node for use in the following
procedures.
2. Modify your NFS server settings to add the new node to the /etc/exports file.
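For example, on the NFS server you might append an entry for the new worker node and re-export the file
systems. This is a sketch only; the exported path, node IP address, and mount options shown are placeholders
and should match your existing /etc/exports entries.

# /etc/exports: allow the new worker node to mount the shared volume
/opt/arcsight/nfs/volumes 192.0.2.25(rw,sync,no_subtree_check)

# Re-export the updated configuration without restarting the NFS service
exportfs -ra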
/etc/profile
/etc/profile.d/proxy.sh
/etc/security/limits.d/20-nproc.conf
/root/.bashrc
/root/.kube
/usr/bin/docker*
/usr/bin/kube*
/usr/bin/vault
/usr/lib/systemd/system/kubelet.service
/usr/lib/systemd/system/kube-proxy.service
/usr/lib/systemd/system/docker.service
/usr/lib/systemd/system/docker.service.d/http_proxy.conf
/usr/lib/systemd/system/docker-bootstrap.service
/usr/lib/systemd/system/docker-bootstrap.service.d/http_proxy.conf
All directories and files under folder $K8S_HOME except the $K8S_HOME/data directory.
3. Backup the etcd database.
Note: This requires temporarily stopping etcd, and then restarting it after the backup.
Restore
If some files were accidentally deleted, restore them from the most recent backup.
For example: If the file $K8S_HOME/scripts/uploadimages.sh was accidentally deleted, restore it
from the backup.
Note: The restored files must have the same owner and permissions as the original files.
In the event of the loss of the master node in a single master deployment, please contact Micro Focus
Support for help with the recovery.
Removing a Node
To remove an existing node from the cluster, open an SSH connection to the node and run the following
commands.
cd $K8S_HOME
./arcsight-uninstall.sh
Note: From a multi-master cluster with 3 master nodes, you can safely remove only one master node.
By removing one of three master nodes you will lose high availability, but the cluster will continue to
function. If you remove two of three master nodes, the cluster may become unavailable, and you will
then need to set up the cluster from scratch.
Note: Such an action needs to be performed manually by the cluster administrator, because there is
no way for the cluster to distinguish permanent node failure from temporary network connectivity issues.
When this command is given, the cluster re-schedules the stateful containers (Kafka, ZooKeeper, stream
processors) to the remaining machines matching the container requirements (labels, resources).
--cacert=$K8S_HOME/ssl/ca.crt
--cert=$K8S_HOME/ssl/server.crt
--key=$K8S_HOME/ssl/server.key
--endpoints=[https://MASTER1_IPV4:4001,https://MASTER2_
IPV4:4001,https://MASTER3_IPV4:4001] endpoint health
2. Get the crashed master etcd member ID by checking the etcd cluster status:
$K8S_HOME/bin/etcdctl
--cacert=$K8S_HOME/ssl/ca.crt
--cert=$K8S_HOME/ssl/server.crt
--key=$K8S_HOME/ssl/server.key
--cacert=$K8S_HOME/ssl/ca.crt
--cert=$K8S_HOME/ssl/server.crt
--key=$K8S_HOME/ssl/server.key
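For reference, identifying and removing the failed member with etcdctl generally follows the pattern below.
This is a sketch that assumes the etcd v3 API; the endpoint, certificate paths, and member ID are
placeholders based on the flags shown above.

# List members to find the ID of the crashed master node
$K8S_HOME/bin/etcdctl --cacert=$K8S_HOME/ssl/ca.crt --cert=$K8S_HOME/ssl/server.crt --key=$K8S_HOME/ssl/server.key --endpoints=https://MASTER1_IPV4:4001 member list

# Remove the crashed member from the etcd cluster
$K8S_HOME/bin/etcdctl --cacert=$K8S_HOME/ssl/ca.crt --cert=$K8S_HOME/ssl/server.crt --key=$K8S_HOME/ssl/server.key --endpoints=https://MASTER1_IPV4:4001 member remove <member ID>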
4. Provision a new host with the same IP address as the original one. Run the installation as described in
the Transformation Hub Deployment Guide. If the crashed node was originally labeled (for example, to
host Kafka), log in to label the new one with the same labels.
1. Prepare the .jks files you want to push and store them in a secure network location.
2. In ArcMC, click Administration > Repositories > New Repository.
3. In Name, Display Name, and Item Display Name, enter KAFKA_JKS
4. Enter other required details as needed, and then click Save.
5. Click Upload to Repository.
6. Follow the prompts in the upload wizard and browse to the first .jks file. Note: make sure to choose the
individual file option.
7. Upload as many files as needed by repeating the upload wizard.
Liveness Probes
A liveness probe is a Kubernetes feature that can be configured to detect problematic pods. Once detected,
Kubernetes will take action to restart a problematic pod. Liveness probes help ensure higher availability of
pods as well as a more robust cluster environment. Consult the Kubernetes documentation for a more
detailed explanation of liveness probes.
initialDelaySeconds Number of seconds after the container has started before liveness probes are initiated.
Note that the first probe execution after startup is not until initialDelaySeconds + periodSeconds.
periodSeconds How often to perform the probe.
timeoutSeconds Number of seconds after which the probe times out.
failureThreshold When a Pod starts and the probe fails, Kubernetes will try failureThreshold times before giving up and
restarting the pod.
1. Run:
kubectl -n <namespace> describe pod <podname>
2. Review the output. Look (or grep) for the line starting with the string Liveness. This will show
some of the probe's configuration.
1. Run:
kubectl get pods
2. If any pod shows 1 or more restarts, run:
kubectl -n <namespace> describe pod <podname>
3. Review any list of events at the end of the output. Liveness probe failures will be shown here.
regex A regular expression for matching against the application's log output.
• The literal property specifies a literal (exact match) search string. If the value matches a portion of
the log text, the liveness probe, on its next periodic check, will report a failure and restart the pod.
• The regex property is similar, except that a regular expression can be specified for the match. This
regex must conform to Java regex rules. To specify a literal value within the regex, use 4 backslashes to
escape it (\\\\). However, if a literal is needed, consider using the literal property instead of regex.
• Multiple search patterns can be specified per property, separated by 4 vertical bars (||||). A match on
any of the patterns will trigger the probe failure.
• There are no default values for these parameters. Log scanning is disabled in the default configuration.
• Matching across multiple rows is not supported. The match must occur on one log line.
• For example, to restart the CEF-to-Avro Stream Processor pod when the value Setting stream
threads to d (where d can be any single digit) is found in the log, change the configuration
property "CEF-to-Avro Stream Processor liveness probes regular expression" to the following value:
Setting stream threads to \\\\d
Verification
To verify that log scanning is configured as intended, review the pod's log and look for entries containing
InputStreamScanner.
For the previous property example, the corresponding log line would be:
InputStreamScanner: Will scan for RegEx pattern [Setting stream threads to \d]
• Default Topics 43
• Data Redundancy and Topic Replication 44
• Managing Topics through ArcMC 44
• Exporting and Importing Routes 44
• Stream Processor Groups 45
Default Topics
Transformation Hub manages the distribution of events to topics, to which consumers can subscribe to
receive events.
Transformation Hub includes the following default topics:
Topic Name Event Type Notes
th-arcsight-avro For ArcSight product use only. Event data in Avro format for use by ArcSight Investigate.
th-arcsight-json-datastore For ArcSight product use only. Event data in JSON format for use by ArcSight infrastructure management.
In addition, using ArcSight Management Center, you can create new custom topics to which your
SmartConnectors can connect and send events.
/opt/arcsight/kubernetes/scripts/get-product-tools.sh
Script options can be displayed by passing the help parameter. For example:
# ./routes_rules.sh -h
Option Description
-e, --export Exports rules/routes to two files in directory [-d]. Required unless [--import] is specified.
-i, --import Imports rules/routes from two files in directory [-d]. Required unless [--export] is specified.
-d, --directory Directory for import/export files. Optional, defaults to current directory.
-s, --server The FQDN of a remote Transformation Hub server. Do not set if script is run from target server.
-u, --user Web service user. Prompted for, if not provided.
-p, --password Web service password. Prompted for, if not provided.
-h,--help Print this usage message.
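For example, to export the current rules and routes from a remote Transformation Hub to a local directory
(the directory, server, and user values below are hypothetical; you are prompted for the password if it is not
supplied):

# Export rules and routes to /tmp/th-routes
./routes_rules.sh --export -d /tmp/th-routes -s th.example.com -u admin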
The number of stream processor instances that can be added to the group is limited only by system
resources. Add additional instances to the group only if required for event processing performance
reasons.
Description Command
Get logs from a specific module, like the Schema Registry
Get the name of the desired pod as follows:
# kubectl get pods -n arcsight-installer-xxxx | grep schemaregistry
th-schemaregistry-2567039683-9l9x 1/1 Running
Run this command to display the logs for the desired pod. You will need to add a -c (container) parameter if the pod
contains more than one container.
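A minimal sketch of that command, using the pod name returned above (the namespace suffix is a
placeholder, as in the example output):

# Display logs for the Schema Registry pod; add -c <container> if the pod contains more than one container
kubectl logs th-schemaregistry-2567039683-9l9x -n arcsight-installer-xxxx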
Glossary
C
Cluster
A group of nodes, pods, or hosts.
Consumer
A consumer of Transformation Hub event data. Consumers may be Micro Focus products such as Logger or ESM, third-party
products like Hadoop, or custom applications built by customers for their own use.
CTH
Collector in Transformation Hub (CTH). A feature where SmartConnector technology operates directly in Transformation
Hub to collect data.
D
Dedicated Master Node
A node dedicated to running the Transformation Hub Kubernetes control plane functionality only.
Destination
In Micro Focus products, a forwarding location for event data. A Transformation Hub topic is one example of a destination.
Docker container
A Docker container is a portable application package running on the Docker software development platform. Containers are
portable among systems running the Linux operating system.
F
flannel
flannel (spelled with a lower-case f) is a virtual network that gives a subnet to each host for use with container runtimes.
Platforms like Google's Kubernetes assume that each container (pod) has a unique, routable IP inside the cluster. The
advantage of this model is that it reduces the complexity of doing port mapping.
I
Initial Master Node
The Master Node that has been designated as the primary Master Node in the cluster. It is from this node that you will install
the cluster infrastructure.
K
Kafka
An open-source messaging system that publishes messages for subscribers to consume on its scalable platform built to run on
servers. It is commonly referred to as a message broker.
Kubernetes
Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized
applications. It groups containers that make up an application into logical units for easy management and discovery.
L
Labeling
Adding a Kubernetes label to a Master or Worker Node creates an affinity for the workload to the Master or Worker Node,
enabling the node to run the specified workload on the labeled server.
M
Master Nodes
Master Nodes run the CDF Installer and process web services calls made to Transformation Hub. They connect to, and are
administered by, the ArcSight Management Center. A minimum of 1 Master Node is required for each TH cluster.
N
Network File System (NFS)
This is the location where the CDF Installer, Transformation Hub, and other components may store persistent data. A
customer-provisioned NFS is required. This environment is referred to in this documentation as an "external" NFS. Although
the CDF platform can host a CDF-provisioned NFS (Internal NFS), for high availability an External NFS service should
be implemented.
Node
A processing location. In Transformation Hub and other containerized applications, nodes come in two types: master and
worker.
P
Pod
Applications running in Kubernetes are defined as “pods”, which group containerized components. Transformation Hub uses
Docker Containers as these components. A pod consists of one or more containers that are guaranteed to be co-located on
the host server and can share resources. Each pod in Kubernetes is assigned a unique IP address within the cluster, allowing
applications to use ports without the risk of conflict.
Producer
A gatherer of event data, such as a SmartConnector or CTH. Typically data from a producer is forwarded to a destination such
as a Transformation Hub topic.
R
Root Installation Folder
The root installation folder is the top level folder that the Transformation Hub, CDF Installer and all supporting product files will
be installed into. The default setting is /opt/arcsight. It is referred to as RootFolder in this document, supporting scripts, and
installation materials.
S
Shared Master and Worker Nodes
A configuration where both Master and Worker Nodes reside on the same hosts. This is not a recommended architecture for
high availability.
SmartConnector
SmartConnectors automate the process of collecting and managing logs from any device and in any format.
T
Thinpool
Using thin provisioning in Docker, you can manage a storage pool of free space, known as a thinpool, which can be allocated to
an arbitrary number of devices when needed by applications.
Transformation Hub
A Kafka-based messaging service that enriches and transforms security data from producers and routes this data to
consumers.
V
Virtual IP (VIP)
To support high availability, a VIP is used as the single IP address or FQDN to connect to a dedicated Master infrastructure
that contains 3 or more master Nodes. The Master Nodes manage Worker Nodes. The FQDN of the VIP can also be used to
connect to the cluster’s Master Nodes.
W
Worker Nodes
Worker nodes ingest, enrich and route events from event producers to event consumers. Worker nodes are automatically
load-balanced by the TH infrastructure.
Z
ZooKeeper
In Kafka, a centralized service used to maintain naming and configuration data and to provide flexible and robust
synchronization within distributed systems.