Multi-tenant logging with Kubernetes & OpenSearch

May 20, 2025 | relevant products: LOMOC

Multi-tenant logging with Kubernetes & OpenSearch: Real-world challenges and solutions

In modern cloud environments, effective logging is essential for operational reliability, troubleshooting and compliance. This becomes particularly challenging when a logging system has to serve multiple tenants, each with their own requirements for security, scalability, visibility and automation. Our partner RISE took on this challenge and presented its learnings and best practices at OpenSearchCon 2025. A recording of the talk is also available. In this article, we summarise the key findings and provide practical insights into the architecture, standardisation and operation of a multi-tenant logging platform based on Kubernetes and OpenSearch.

The search for the optimal logging stack: From Elasticsearch to OpenSearch

RISE's logging journey began in 2015, when it implemented its first projects with ELK stacks and X-Pack in the Austrian public sector, including z/OS CICS applications. With growing experience, the decision to move away from proprietary licensing models matured. In 2021, RISE completed the full transition to OpenSearch. This allowed the organisation to establish an independent, fully managed logging service, free from commercial restrictions and with greater flexibility in platform operation, which led to our product LOMOC.

From a single case to a platform: Why standardisation became essential

As the platform grew, so did its complexity. New customer projects, different environments and a growing need for reusability led to creeping heterogeneity, which made operations increasingly difficult. To counteract this, RISE opted for a rigorous standardisation concept. Technically, this meant the consistent automation of all OpenSearch-relevant configuration aspects such as cluster structure and certificate management. At the requirements level, binding conventions were established for index names, field taxonomies and retention periods. Structured logging in JSON format also became mandatory. In operation, it was particularly important to ensure a secure tenant structure and to enable uniform monitoring through clear separation of data and access.
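To illustrate what mandatory structured logging in JSON can look like on the application side, here is a minimal sketch using Python's standard logging module. The field names (@timestamp, level, logger, message, stack_trace) are assumptions chosen for illustration, not the field taxonomy RISE actually prescribes.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # Illustrative field names, not the platform's real taxonomy.
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep stack traces inside the same JSON document.
            payload["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(payload, ensure_ascii=False)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that emits JSON lines on stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    build_logger("checkout").info("order %s accepted", "A-1001")
```

Because every record is one JSON object per line, a downstream pipeline can parse it without per-application grok patterns.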

Added value instead of infrastructure: platform engineering at RISE

Our partner RISE is pursuing a clear paradigm shift in platform operation – away from mere infrastructure provision and towards a comprehensive value-added model. The aim is to enable customers to gain independently usable insights from their log data. To this end, the platform provides reusable analysis tools and standardised interfaces that enable different roles such as developers, operations managers and security teams to work efficiently. The complexity of the infrastructure remains largely hidden: whether HA setup, shard allocation or parsing logic – everything is automated in the background. What remains is an abstracted and user-friendly interface that focuses on operational results rather than technology.

Kubernetes as the foundation: structure and access securely separated

The Kubernetes architecture is based on a clear separation between development, test and production environments. Within a cluster, each tenant is assigned its own namespaces. Access and deployment rights are controlled via LDAP groups, ensuring that only authorised developers can access their respective workloads. Implementation follows a GitOps approach with ArgoCD, so that changes are versioned traceably and rolled out via standardised workflows.
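One common way to tie a namespace to a directory group is a Kubernetes RoleBinding whose subject is a group. The sketch below generates such a manifest as a Python dictionary. It assumes that the cluster's authentication layer exposes LDAP group names as Kubernetes groups and that the built-in "edit" ClusterRole is an acceptable permission level; the namespace and group names are hypothetical, and the article does not say how RISE wires this up in detail.

```python
import json


def tenant_role_binding(
    namespace: str, ldap_group: str, cluster_role: str = "edit"
) -> dict:
    """Build a RoleBinding granting one group a ClusterRole inside one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            "name": f"{ldap_group}-{cluster_role}",
            "namespace": namespace,
        },
        "subjects": [
            {
                # Assumption: the authenticator maps LDAP groups to this name.
                "kind": "Group",
                "name": ldap_group,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        "roleRef": {
            "kind": "ClusterRole",
            "name": cluster_role,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


if __name__ == "__main__":
    # Hypothetical tenant namespace and group.
    print(json.dumps(tenant_role_binding("acme-dev", "acme-developers"), indent=2))
```

A manifest like this can be committed to the Git repository that ArgoCD watches, so access changes go through the same review and rollout path as workloads.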

Log data flow: Integration of Kubernetes and OpenSearch

To make the integration between Kubernetes and OpenSearch efficient, RISE established an end-to-end data pipeline. Container logs are collected via the Kubernetes Logging Operator and transferred to Logstash, which then forwards them to OpenSearch in a structured format. Tenant separation is enforced technically as well as logically, with a separate log flow for each namespace. Access is controlled per tenant, and preconfigured dashboards let users analyse their data directly.

Structure through standardisation: systematic indexing and parsing

A central element of standardisation was the introduction of a carefully designed index naming scheme. It follows a fixed pattern that encodes not only the log type but also the tenant, the cluster, the lifecycle (e.g. ‘default’ or ‘archive’) and the date. This allows retention policies (ISM policies) to be applied automatically. The Logstash pipelines were also designed to be modular: a default pipeline handles basic parsing, while individual pipelines can be added for specific customer requirements. Unstructured logs were also taken into account, through a pragmatic balance between standardisation and necessary flexibility.
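A naming scheme of this kind can be sketched as a small value type that renders and parses index names. The article lists the components but not their order, separator or date format, so the layout used here (logtype-tenant-cluster-lifecycle-YYYY.MM.DD) is an assumption for illustration only.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class IndexName:
    """Components of an index name; order and separator are assumed."""

    log_type: str
    tenant: str
    cluster: str
    lifecycle: str  # e.g. "default" or "archive"
    day: date

    def render(self) -> str:
        parts = [self.log_type, self.tenant, self.cluster, self.lifecycle]
        for part in parts:
            # A "-" inside a component would make the name ambiguous to parse,
            # and OpenSearch index names must be lowercase.
            if not part or "-" in part or part != part.lower():
                raise ValueError(f"invalid index name component: {part!r}")
        return "-".join(parts + [self.day.strftime("%Y.%m.%d")])

    @classmethod
    def parse(cls, name: str) -> "IndexName":
        # Raises ValueError if the name does not have exactly five components.
        log_type, tenant, cluster, lifecycle, stamp = name.split("-")
        day = datetime.strptime(stamp, "%Y.%m.%d").date()
        return cls(log_type, tenant, cluster, lifecycle, day)


if __name__ == "__main__":
    name = IndexName("container", "acme", "prod1", "default", date(2025, 5, 20))
    print(name.render())
```

Keeping the lifecycle in a fixed position is what lets a wildcard pattern select all indices of one retention class.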

Automated retention with ISM policies

The platform distinguishes between different retention classes to meet varying retention requirements. In addition to a general standard policy, there are long-term archiving solutions for compliance-relevant data and customisable rules for specific customer requirements. Correct assignment is carried out automatically via the index schema, which significantly reduces administrative effort.
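As a rough sketch of how such automatic assignment can work, the function below builds an Index State Management policy document whose ism_template matches indices by the lifecycle segment of their name. The index pattern assumes a hypothetical name layout in which the lifecycle is surrounded by "-" separators, and the retention periods in the usage example are invented placeholders, not values from RISE.

```python
import json


def build_retention_policy(
    lifecycle: str, retention_days: int, priority: int = 100
) -> dict:
    """Build an ISM policy body that deletes matching indices after a set age."""
    return {
        "policy": {
            "description": f"Delete '{lifecycle}' indices after {retention_days} days",
            "default_state": "active",
            "states": [
                {
                    "name": "active",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": f"{retention_days}d"},
                        }
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            # New indices whose names match are attached to this policy.
            "ism_template": {
                "index_patterns": [f"*-{lifecycle}-*"],  # assumed name layout
                "priority": priority,
            },
        }
    }


if __name__ == "__main__":
    # Placeholder retention values for illustration only.
    for lifecycle, days in {"default": 30, "archive": 365}.items():
        print(json.dumps(build_retention_policy(lifecycle, days), indent=2))
```

Each body would be sent with PUT to _plugins/_ism/policies/<policy_id>; the policy ID itself is free to choose.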

Quick start with declarative onboarding

Customer onboarding is carried out using a declarative model with custom resources. The OpenSearchSyncData definition describes all the necessary parameters for a new tenant: cluster names, LDAP groups, project name and namespace. Based on this data, the corresponding OpenSearch tenant is automatically set up, roles and rights are assigned, and standard dashboards are loaded. Once the setup process is complete, log shipping starts, and the customer can begin analysis immediately.
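The step from a declarative description to concrete OpenSearch objects can be sketched as a pure function that turns the listed parameters into Security plugin REST calls. The SyncData class below is a hypothetical stand-in for the OpenSearchSyncData resource, whose real schema is not shown in the article. The role name, the index pattern and the chosen action groups are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncData:
    """Hypothetical mirror of the parameters named for OpenSearchSyncData."""

    cluster_names: tuple[str, ...]
    ldap_groups: tuple[str, ...]
    project: str
    namespace: str


def plan_security_requests(spec: SyncData) -> list[tuple[str, str, dict]]:
    """Return (method, path, body) tuples for tenant, role and role mapping."""
    tenant = spec.project
    role = f"{spec.project}-logs"  # assumed role naming
    # Assumed index layout: <logtype>-<tenant>-<cluster>-<lifecycle>-<date>
    patterns = [f"*-{spec.project}-{cluster}-*" for cluster in spec.cluster_names]
    return [
        (
            "PUT",
            f"_plugins/_security/api/tenants/{tenant}",
            {"description": f"Tenant for namespace {spec.namespace}"},
        ),
        (
            "PUT",
            f"_plugins/_security/api/roles/{role}",
            {
                "cluster_permissions": [],
                "index_permissions": [
                    {"index_patterns": patterns, "allowed_actions": ["read"]}
                ],
                "tenant_permissions": [
                    {"tenant_patterns": [tenant], "allowed_actions": ["kibana_all_read"]}
                ],
            },
        ),
        (
            "PUT",
            f"_plugins/_security/api/rolesmapping/{role}",
            # LDAP groups arrive in OpenSearch as backend roles.
            {"backend_roles": list(spec.ldap_groups)},
        ),
    ]
```

Separating the planning step from the HTTP calls keeps the mapping easy to test and makes repeated reconciliation runs predictable, since PUT on these endpoints creates or replaces the object.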

Transparency through dashboards and quota monitoring

Various standard dashboards are automatically provided to support users, including ingress and container overviews and quota displays. The latter are based on regular queries of storage usage via the _cat/indices API and provide customers and operators with information about resource consumption at all times. This transparent reporting not only promotes cost control but also confidence in the platform.
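A quota display of this kind can be derived by summing index sizes per tenant. The sketch below works on rows as returned by GET _cat/indices?format=json&bytes=b&h=index,store.size. It assumes a hypothetical index name layout in which the tenant is the second of five "-" separated components, and the quota values are invented for the example.

```python
from collections import defaultdict


def usage_by_tenant(cat_rows: list[dict]) -> dict[str, int]:
    """Sum store size in bytes per tenant from _cat/indices JSON rows."""
    totals: dict[str, int] = defaultdict(int)
    for row in cat_rows:
        parts = row["index"].split("-")
        if len(parts) != 5:
            continue  # skip system indices and anything outside the scheme
        size = row.get("store.size")
        if size is None:
            continue  # closed indices report no store size
        totals[parts[1]] += int(size)
    return dict(totals)


def quota_report(usage: dict[str, int], quotas: dict[str, int]) -> dict[str, float]:
    """Return the percentage of quota used for each tenant with a known quota."""
    return {
        tenant: round(100 * used / quotas[tenant], 1)
        for tenant, used in usage.items()
        if tenant in quotas
    }


if __name__ == "__main__":
    rows = [
        {"index": "container-acme-prod1-default-2025.05.20", "store.size": "600"},
        {"index": "ingress-acme-prod1-default-2025.05.20", "store.size": "400"},
        {"index": ".kibana_1", "store.size": "50"},
    ]
    print(quota_report(usage_by_tenant(rows), {"acme": 4000}))
```

Run on a schedule, the result can be written back to an index that feeds the quota dashboard.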

Technological limitations and workarounds in multi-tenant operation

Despite all the progress made, certain limitations arise when using OpenSearch in multi-tenant environments. For example, the alerting plugin does not allow complete tenant separation: users who share the same backend role may be able to see or change each other's alerts. There is also a lack of self-service functionalities for log parsing. Logstash is not tenant-aware and changes can only be made manually, which requires a corresponding amount of organisational effort. Equally problematic is the lack of native ingest limits or tenant-related quotas in OpenSearch itself. A single tenant with a high data volume can impair the performance of the entire cluster. RISE addresses these limitations with external monitoring, custom alerting metrics and clearly defined operating concepts.
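Since OpenSearch offers no native per-tenant ingest limits, an external check is one way to at least detect a noisy tenant. The following sketch compares two snapshots of per-tenant document counts and flags tenants whose ingest rate exceeds a threshold. How the counts are collected, the function name and the threshold are all assumptions, and this is not a description of RISE's own tooling.

```python
def noisy_tenants(
    docs_before: dict[str, int],
    docs_after: dict[str, int],
    interval_seconds: float,
    max_docs_per_second: float,
) -> dict[str, float]:
    """Return tenants whose ingest rate between two snapshots exceeds the limit."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    flagged: dict[str, float] = {}
    for tenant, after in docs_after.items():
        delta = after - docs_before.get(tenant, 0)
        # Deleted or rolled-over indices can make the count shrink; treat as zero.
        rate = max(delta, 0) / interval_seconds
        if rate > max_docs_per_second:
            flagged[tenant] = rate
    return flagged


if __name__ == "__main__":
    before = {"acme": 1_000, "globex": 500}
    after = {"acme": 61_000, "globex": 800}
    print(noisy_tenants(before, after, interval_seconds=60, max_docs_per_second=500))
```

A check like this only observes and reports; throttling itself would still have to happen upstream, for example in the shipping pipeline.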

What we learned: Process clarity, responsibilities and monitoring

A key lesson at RISE was the value of clearly defined support processes. Structured ticket templates, central communication channels and uniform escalation paths drastically reduced queries, misunderstandings and delays. Equally essential was defining the technical and organisational boundaries of the service. Who is responsible for what (for example alerting, parsing or role management) must be clearly defined to avoid uncontrolled shifts in responsibility and unpredictable changes. Last but not least, it became clear that a multi-tenant platform is virtually impossible to operate without comprehensive monitoring. Early detection of ingestion lag, log delays or cluster instabilities makes the difference between proactive operation and reactive troubleshooting. RISE relies on a combination of Grafana, Prometheus and OpenSearch dashboards, supplemented by its own tools such as ‘LomocTop’ for system analysis.

Conclusion: Multi-tenant logging is more than a technical project

Building a scalable, secure and multi-tenant logging platform requires not only deep technical understanding, but also a well-thought-out process structure, strategic decisions and consistent standardisation. With its platform, RISE has not only created a powerful solution, but also provided a blueprint that can serve as a model for many companies and organisations facing similar challenges. RISE bundles this knowledge for you in the product LOMOC – and our customers can draw on the experience of RISE OpenSearch experts during product implementation!

Interested in this topic? - Talk to us!