
Manufacturing IT Support: When Your ERP Goes Down at 2 AM

Patrick Krischel
May 28, 2026
7 min read

Most IT support contracts were designed for office buildings. Nine to five, Monday to Friday, with after-hours support that means leaving a voicemail and hearing back the next morning. Then manufacturing companies sign those contracts, and an entirely predictable thing happens. The ERP connection to the shop floor drops at 2 a.m., the night shift cannot post receipts to inventory, and the production schedule starts slipping in real time while a service ticket sits in a queue waiting for someone to clock in at eight.

I lead a global service delivery team of more than 500 people at Virteva, and a meaningful share of our work supports manufacturers, distributors, and other operations that run on schedules the office world does not see. The pattern of broken support contracts is consistent enough that I want to lay out what production-critical IT support actually requires, separate from what most providers are willing to deliver.

The Failure Modes That Define Manufacturing IT

When an office worker’s email is slow, the cost is annoyance. When a manufacturing system fails, the cost compounds by the minute. Here are a few of the scenarios I have seen play out in the past year alone.

The ERP-to-shop-floor connection drops. The manufacturing execution system can no longer pull work orders, post production counts, or update inventory. The shop floor falls back to paper. Two hours of paper records are then manually re-entered the next day, with errors. The financial close that month gets messy. The root cause was a Windows update that restarted a connector service without restarting the application that depends on it.

A SCADA system throws a configuration error. A line that runs three shifts cannot start the next batch because the historian is not capturing tags correctly. The plant manager does not know whether the line is actually safe to run with degraded telemetry. Production stops while someone tries to reach an IT contact who has been off-shift for six hours.

The MES stops communicating with the quality system. Parts ship without their quality holds being released because the system that flags them is offline. The customer rejects the shipment a week later. Now there is a discrepancy investigation, a returned shipment, a customer escalation, and, if the parts are destined for a regulated industry, questions from regulators.

A ransomware indicator fires on a production server. The plant’s security team is two states away. The IT contact is asleep. The night supervisor has to decide whether to shut down a line, isolate a server, or wait. Most night supervisors in this scenario will wait, and waiting is sometimes the worst possible answer.

In each of these scenarios, the failure is not the technology. The technology will always fail eventually. The failure is a support model that was not designed for the time of day or the criticality of the system.

What Real 24/7 Looks Like

When a provider tells you they offer 24/7 support, ask the questions that make the claim falsifiable.

Who answers the phone at 2 a.m. on a Sunday? A specific person on a specific shift, sitting at a desk, currently working a queue. Not an on-call engineer who is asleep until paged. Not an answering service that takes a message. The difference is response time. A staffed engineer responds in minutes. An on-call rotation responds in tens of minutes if you are lucky and hours if you are not.

Where is that person, and what time zone is it for them? A truly 24/7 model has at least two delivery centers in different time zones so that no shift is working through its own night. Our model has engineers in Minneapolis and a delivery center in Manila, which means a 2 a.m. ticket from a US plant lands in the middle of someone’s shift in the Philippines. They are awake, alert, and not the only engineer covering the queue.

What is the response SLA for a sev-1 outage outside business hours? Real numbers, in writing, with financial remedies if missed. If the SLA is 15 minutes during business hours and four hours overnight, the provider is not running 24/7 support. They are running business-hours support with a courtesy line.

How many concurrent tickets is the on-shift engineer handling? This is a question most providers do not want to answer because it reveals their actual capacity. A single engineer covering an overnight queue for 40 clients is not 24/7 support. That is hope as a strategy.
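One way to keep these answers comparable across providers is to record them as data and check them the same way every time. The Python sketch below is a minimal illustration only: the field names, the `red_flags` helper, and the clients-per-engineer ceiling are assumptions made for the example, not an industry standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CoverageClaim:
    """What a provider has committed to in writing (illustrative fields)."""
    business_hours_response_min: int
    overnight_response_min: int
    financial_remedy: bool
    overnight_engineers: int
    overnight_clients: int


# Hypothetical ceiling; the right number depends on ticket volume and complexity.
MAX_CLIENTS_PER_ENGINEER = 10


def red_flags(claim: CoverageClaim) -> List[str]:
    """Return the reasons a 24/7 claim does not hold up, if any."""
    flags = []
    if claim.overnight_response_min > claim.business_hours_response_min:
        flags.append("overnight sev-1 response is slower than business hours")
    if not claim.financial_remedy:
        flags.append("no financial remedy for a missed SLA")
    if claim.overnight_engineers < 1:
        flags.append("no engineer staffed overnight")
    elif claim.overnight_clients / claim.overnight_engineers > MAX_CLIENTS_PER_ENGINEER:
        flags.append("too many clients per overnight engineer")
    return flags


# The example from the questions above: 15 minutes by day, four hours
# overnight, one engineer covering 40 clients.
claim = CoverageClaim(15, 240, True, 1, 40)
for flag in red_flags(claim):
    print(flag)
```

A claim with the same response time on every shift, a written remedy, and a reasonable overnight load returns an empty list.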

Production-Critical Systems Need Production-Critical Monitoring

Reactive support is not enough for systems that cannot afford to fail. The other half of manufacturing IT is monitoring designed for the systems that actually run the business.

In a typical office environment, monitoring tells you when a server is offline. In a production environment, monitoring needs to tell you when a service inside a server is degraded, when a connector is no longer receiving data, when a queue is backing up, and when latency between the manufacturing execution system and the warehouse management system is climbing. By the time the server itself goes offline, the line has been bleeding for hours.
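To make the difference concrete, here is a minimal Python sketch of a check that looks inside a running integration instead of asking whether the server is up. It is an illustration under stated assumptions: the `ConnectorSample` fields, the threshold values, and the signal names are hypothetical, and a real deployment would read these values from whatever monitoring platform is already in place.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Sequence


@dataclass(frozen=True)
class ConnectorSample:
    """One observation of an integration, e.g. MES to WMS (illustrative)."""
    name: str
    last_message_at: datetime
    queue_depth: int
    latency_ms: Sequence[float]  # recent round-trip samples, oldest first


# Hypothetical thresholds; tune them per system and per plant.
MAX_SILENCE = timedelta(minutes=5)
MAX_QUEUE_DEPTH = 500
LATENCY_GROWTH_FACTOR = 2.0


def degraded_signals(sample: ConnectorSample, now: datetime) -> List[str]:
    """Flag degradation that a plain server-up/server-down check would miss."""
    signals = []
    if now - sample.last_message_at > MAX_SILENCE:
        signals.append("stale")  # connector is no longer receiving data
    if sample.queue_depth > MAX_QUEUE_DEPTH:
        signals.append("queue_backlog")  # messages are piling up
    half = len(sample.latency_ms) // 2
    if half >= 1:
        # Compare the average of the older half with the newer half.
        older = sum(sample.latency_ms[:half]) / half
        newer = sum(sample.latency_ms[half:]) / (len(sample.latency_ms) - half)
        if older > 0 and newer > older * LATENCY_GROWTH_FACTOR:
            signals.append("latency_climbing")
    return signals


now = datetime(2026, 1, 1, 2, 0, tzinfo=timezone.utc)
sample = ConnectorSample(
    name="MES->WMS",
    last_message_at=now - timedelta(minutes=10),
    queue_depth=20,
    latency_ms=[40.0, 42.0, 95.0, 110.0],
)
print(sample.name, degraded_signals(sample, now))
```

In this example the host would still answer a ping, yet the connector has been silent for ten minutes and its latency has more than doubled.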

The systems that need this level of monitoring usually include the ERP and any modules that touch the shop floor, the manufacturing execution system, the warehouse management system, the historian and SCADA platforms, the quality management system, and the integrations between them. These systems also need monitoring on the integration layer itself: API endpoints, message queues, and database replication. A failure between systems is harder to diagnose and just as costly as a failure inside one.
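A simple way to find gaps in that integration layer is to list every system-to-system link and compare it against the checks that actually exist. The Python sketch below assumes a hypothetical plant: the integration map, the check names, and the `unmonitored` helper are all invented for illustration.

```python
from typing import Dict, List, Set, Tuple

Integration = Tuple[str, str]  # (source system, target system)

# Hypothetical integration map for one plant.
INTEGRATIONS: List[Integration] = [
    ("ERP", "MES"),
    ("MES", "WMS"),
    ("MES", "QMS"),
    ("SCADA", "Historian"),
]

# Checks currently configured on each integration (illustrative).
CHECKS: Dict[Integration, Set[str]] = {
    ("ERP", "MES"): {"api_endpoint", "message_queue"},
    ("MES", "WMS"): {"db_replication"},
    ("MES", "QMS"): set(),
}


def unmonitored(
    integrations: List[Integration],
    checks: Dict[Integration, Set[str]],
) -> List[Integration]:
    """Return integrations with no integration-layer check at all."""
    return [pair for pair in integrations if not checks.get(pair)]


for source, target in unmonitored(INTEGRATIONS, CHECKS):
    print(f"No integration-layer check on {source} -> {target}")
```

Here the MES-to-QMS and SCADA-to-historian links would surface as blind spots even though every server involved might be individually monitored.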

Our infrastructure management practice was built with this kind of telemetry in mind because office monitoring tooling does not catch the failure modes that matter on a production floor.

OT and IT Are Different Disciplines

Manufacturing IT also has to handle the boundary between operational technology and information technology. The engineers who can troubleshoot a Windows Server outage are not always the engineers who can troubleshoot a PLC communication failure. Many providers blur this distinction in their marketing and discover the gap during the first plant-floor incident.

A real manufacturing support model identifies the OT and IT scope separately, names the partners or staff who cover OT, and defines the escalation path between the two when a problem crosses the line. We typically partner with the customer’s OT team or specialized OT integrator and own the IT side of the boundary explicitly, which is a better answer than pretending we are something we are not.

What to Ask When You Evaluate a Provider for Manufacturing

If you are evaluating a managed services provider for a manufacturing operation, the questions below separate the providers who understand production from the ones who do not.

Walk me through the last production-impacting incident at a manufacturing client. What was the time to acknowledge, time to escalate, and time to resolve? What was the root cause? How was communication handled with the plant manager?

What does your monitoring catch on an ERP-to-MES integration that office monitoring would miss?

How do you handle change management for systems that cannot tolerate unplanned restarts? What is your change advisory board, who sits on it, and what is the approval window for a change that affects a production system?

What is your scope for OT systems, and where does the boundary sit? Who handles a PLC issue at 2 a.m. that turns out to be a network problem and not a controls problem?

How do you staff for shift coverage, and what is the average tenure of your overnight engineers? High overnight turnover means new engineers are learning your environment in real time during incidents.
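The incident timings in the first question are easy to compute once a provider hands over timestamps, which makes them hard to fudge. The Python sketch below assumes a hypothetical `Incident` record with four timestamps; the field names are illustrative and do not correspond to any particular ticketing system's schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass(frozen=True)
class Incident:
    """Timestamps for one production-impacting incident (illustrative)."""
    opened_at: datetime
    acknowledged_at: datetime
    resolved_at: datetime
    escalated_at: Optional[datetime] = None  # None if never escalated


def minutes_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 60


def incident_timings(incident: Incident) -> Dict[str, Optional[float]]:
    """Minutes from ticket open to acknowledge, escalate, and resolve."""
    escalate = (
        None
        if incident.escalated_at is None
        else minutes_between(incident.opened_at, incident.escalated_at)
    )
    return {
        "time_to_acknowledge_min": minutes_between(
            incident.opened_at, incident.acknowledged_at
        ),
        "time_to_escalate_min": escalate,
        "time_to_resolve_min": minutes_between(
            incident.opened_at, incident.resolved_at
        ),
    }


incident = Incident(
    opened_at=datetime(2026, 3, 8, 2, 0),
    acknowledged_at=datetime(2026, 3, 8, 2, 4),
    escalated_at=datetime(2026, 3, 8, 2, 20),
    resolved_at=datetime(2026, 3, 8, 3, 30),
)
print(incident_timings(incident))
```

Running the same calculation over a provider's last several overnight incidents, and comparing the results with their daytime numbers, shows whether the overnight shift performs like the day shift.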

How Virteva Supports Manufacturing

Manufacturing is a meaningful part of our practice, and our service model was shaped by it. Our 24/7 support is real because we run a hybrid Minneapolis and Manila operation with engineers actively staffed across all shifts. Our service desk is built on ServiceNow with manufacturing-specific monitoring and runbooks for the systems that matter most. We have learned the difference between an office IT incident and a production-impacting incident the hard way, by being the team on the call when the line was down.

If you are running a manufacturing operation and your IT support contract was not designed for production, the gap is going to find you eventually. If you want a practical assessment of where your support model has holes, we are happy to walk through it. Bring your last three production-impacting incidents and we will show you, specifically, where a different support model would have changed the outcome.

Tags: Manufacturing IT, 24/7 IT Support, ERP Support, Production IT
