IT Software Engineers
Overview
IT Software Engineers build and maintain the software and automation that keep platforms and products running, including operating-system-adjacent components, tooling, integrations, build/release pipelines, and reliability improvements.
They work with system programmers, analysts, architects, and other engineers to design and improve system capabilities, define performance interfaces, and coordinate software installation and change.
Typical responsibilities
Analyze user and business needs and translate them into technical requirements.
Design, implement, and maintain platform-level software components and integrations.
Automate repeatable operational tasks (provisioning, deployments, configuration, compliance checks).
Troubleshoot complex issues using logs, traces, dumps, and performance data.
Plan and execute changes safely (testing, rollout/backout, communication, post-change verification).
Document procedures and mentor others on best practices.
Common job titles (examples)
z/OS Systems Programmer (Base / z/OS): Installs, configures, and maintains the z/OS operating system and core components.
DevOps / CI/CD Engineer (Mainframe): Builds pipelines for mainframe builds/tests/deployments and integrates with enterprise DevOps tooling.
Platform / Automation Engineer: Creates automation for workflows, provisioning, and environment consistency.
ISV Product Engineer: Develops and maintains IBM Z software products for an Independent Software Vendor (ISV) or IBM.
Skills
Mainframe platform fundamentals
z/OS basics (address spaces, system services, started tasks)
Explain why a started task runs continuously while a batch job ends, and where you’d look when it fails to start.
TSO/E and ISPF navigation
Locate a data set/member in ISPF, update it safely, and confirm you saved the change.
UNIX System Services (USS): shell basics, permissions, file transfers
Verify a file’s permissions, adjust them appropriately, and transfer a file between workstation and USS.
Data sets (PS, PDS/PDSE) and basic storage concepts
Identify whether an input is a sequential data set or a PDS/PDSE member and choose the right reference format.
JCL fundamentals and reading job output
Read a job log to determine which step failed and whether the cause was a JCL error, a missing data set, or a program return code.
Basic security concepts (SAF/RACF terminology, least privilege)
Describe what it means to grant access via groups/roles instead of giving an individual broad privileges.
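As one way to practice the USS permission task listed above, the following Python sketch checks whether a file is world-writable and clears only that bit. It assumes Python is available in the USS environment, which is not guaranteed in every shop, and it uses only the standard library. The function names and the scratch file are hypothetical and chosen for illustration.

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True if the 'other' write bit is set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & stat.S_IWOTH)


def remove_world_write(path):
    """Clear only the 'other' write bit and leave all other bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IWOTH)


def demo():
    # Hypothetical scratch file standing in for a real USS file.
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "example.cfg")
        with open(target, "w") as handle:
            handle.write("setting=1\n")
        os.chmod(target, 0o666)  # deliberately too permissive
        before = is_world_writable(target)
        remove_world_write(target)
        after = is_world_writable(target)
        return before, after


if __name__ == "__main__":
    print(demo())
```

Clearing a single bit, rather than setting a fixed mode, keeps the owner and group permissions exactly as they were, which fits the least-privilege idea described in this section.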
Mainframe tooling & problem determination
Reading system and job output (JES, job logs, SYSLOG/message flow concepts)
Correlate “what changed” with the first error messages that appeared after a deployment or configuration update.
Using common operations views/tools (for example: SDSF for job/spool browsing)
Find a job’s spool output, identify the failing DD name, and capture the key messages for escalation.
Basic dump/trace awareness (what they are used for, how they support root-cause analysis)
Recognize when an abend likely requires a dump/trace to diagnose vs. when a simple config fix is sufficient.
Knowing when to escalate and what evidence to collect (logs, timestamps, job names, recent changes)
Open an incident with the exact job name, step name, time of failure, and the last few relevant messages.
Performance & reliability
Understanding what “good” looks like (baseline behavior, SLAs, peak vs. off-peak)
Compare today’s batch duration to a normal baseline and flag a meaningful deviation.
Basic concepts behind workload and resource management (for example: priorities/service classes)
Explain why two workloads get different response times even on the same system.
Identifying performance symptoms vs. root causes (CPU, I/O, enqueue/locking, storage constraints)
Notice that “high CPU” is a symptom and investigate whether the real bottleneck is I/O waits or contention.
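The baseline comparison mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring method prescribed by this guide: it assumes a "meaningful deviation" means more than two standard deviations from the mean of recent runs, and the function name, threshold, and sample durations are all hypothetical.

```python
from statistics import mean, stdev


def flag_deviation(baseline_minutes, today_minutes, threshold=2.0):
    """Return True when today's duration is more than `threshold`
    standard deviations away from the baseline mean."""
    if len(baseline_minutes) < 2:
        raise ValueError("need at least two baseline runs")
    average = mean(baseline_minutes)
    spread = stdev(baseline_minutes)
    if spread == 0:
        # Every baseline run took the same time, so any difference stands out.
        return today_minutes != average
    return abs(today_minutes - average) / spread > threshold


# Hypothetical batch durations in minutes from recent normal runs.
history = [42.0, 40.5, 43.2, 41.8, 42.6]

if __name__ == "__main__":
    print(flag_deviation(history, 42.9))
    print(flag_deviation(history, 61.0))
```

A real baseline would usually separate peak from off-peak runs, since mixing them widens the spread and hides genuine slowdowns.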
Change, release, and maintenance discipline
Change control fundamentals (risk assessment, rollout/backout plans, verification steps)
Write a simple rollout/backout checklist and define what “success” looks like after the change.
Versioning and promotion practices across environments (dev/test/stage/prod)
Verify that the artifact/config promoted to production is the same version that passed testing.
Awareness of platform maintenance concepts (PTFs/fixes, dependencies, maintenance windows)
Identify that a fix has prerequisites and plan its installation during a maintenance window.
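One common way to confirm that the promoted artifact matches the tested one is to compare checksums. The sketch below assumes both artifacts are reachable as ordinary files and uses SHA-256 from Python's standard library. The function names and file names are hypothetical, and a shop's release tooling may already record and compare hashes for you.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(tested_path, promoted_path):
    """Return True when both files have identical SHA-256 digests."""
    return sha256_of(tested_path) == sha256_of(promoted_path)


def demo():
    # Hypothetical files: two identical builds and one that has drifted.
    with tempfile.TemporaryDirectory() as workdir:
        tested = os.path.join(workdir, "tested.bin")
        promoted = os.path.join(workdir, "promoted.bin")
        drifted = os.path.join(workdir, "drifted.bin")
        samples = ((tested, b"build-1"), (promoted, b"build-1"), (drifted, b"build-2"))
        for path, payload in samples:
            with open(path, "wb") as handle:
                handle.write(payload)
        return same_artifact(tested, promoted), same_artifact(tested, drifted)


if __name__ == "__main__":
    print(demo())
```

Recording the digest at test time and checking it again at promotion also leaves simple audit evidence that nothing changed in between.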
Engineering skills (portable across platforms)
Requirements gathering and translating needs into technical designs
Turn “we need faster deployments” into specific requirements (approvals, test gates, rollback, audit evidence).
Source control and collaboration (Git, pull requests, code review)
Create a small, reviewable change with a clear description and accept feedback via a pull request.
Testing mindset (unit/integration/regression) and change control discipline
List the minimum regression checks that must pass before promoting a change.
Troubleshooting and root-cause analysis
Reproduce an issue in a lower environment, isolate variables, and document the confirmed root cause.
Clear written documentation and cross-team communication
Write a runbook entry that enables someone else to resolve the issue without asking you.
Languages & automation (commonly seen)
Scripting: REXX, shell, Python (where available)
Automate a repetitive check (e.g., validate that required files/data sets exist before running a job).
General-purpose: Java, C/C++ (varies by shop)
Build a small utility that parses logs or transforms data for a batch pipeline.
Automation & APIs: z/OSMF workflows/APIs, Zowe CLI, configuration automation tooling
Run an automated workflow to provision resources and then verify the expected artifacts were created.
Specific tools and languages vary widely by employer category and by whether the role is enterprise IT vs. ISV product engineering.
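The scripting example above (validating that required inputs exist before running a job) might look like the following Python sketch. It only checks regular files, such as files in the USS file system; checking MVS data sets would need a platform-specific interface that this sketch does not attempt. The function name and file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def missing_inputs(required_paths):
    """Return the required paths that are not existing regular files,
    in the order they were given."""
    return [str(path) for path in required_paths if not Path(path).is_file()]


def demo():
    # Hypothetical inputs: one file that exists and one that does not.
    with tempfile.TemporaryDirectory() as workdir:
        present = os.path.join(workdir, "input.dat")
        absent = os.path.join(workdir, "control.dat")
        Path(present).write_text("sample\n")
        return missing_inputs([present, absent]) == [absent]


if __name__ == "__main__":
    print(demo())
```

A wrapper script could exit with a non-zero status whenever the returned list is non-empty, so the job is not submitted with inputs missing.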
Next steps (recommended learning path)
Learn “how work gets done” on the platform
Learning resources
Roles and Categories (how this role fits into the overall role taxonomy)
Get Ready: Talk Like a Mainframer (glossary-level familiarity)
Category
Role type: IT Software Engineering (see Roles and Categories)
Employer categories: commonly found across multiple categories (see Category Definitions)