Full-Time

PySpark Big Data Senior Developer

Vice President

Posted on 5/9/2026

Citi

10,001+ employees

Global financial services including banking, investment

Compensation Overview

$120.8k - $170.8k/yr

Mississauga, ON, Canada

In Person

Category
Data & Analytics
Required Skills
Kubernetes
Microsoft Azure
Redshift
Python
Git
BigQuery
Apache Spark
SQL
Apache Kafka
Docker
AWS
Apache Hive
MongoDB
REST APIs
Hadoop
YARN
DevOps
Databricks
Cassandra
Snowflake
Google Cloud Platform
Requirements
  • 6+ years of extensive, hands-on experience as a Senior Big Data Developer, with a strong emphasis on PySpark and the Apache Spark ecosystem, operating as a player/coach
  • Expert proficiency in Python, with a proven track record of developing robust, scalable, and high-performance PySpark applications for large-scale data processing
  • Deep understanding and extensive hands-on experience with Apache Spark (Spark Core, Spark SQL, Spark Streaming) and its ecosystem
  • Experience with distributed computing frameworks such as Hadoop (HDFS, YARN)
  • Expert proficiency in SQL and extensive experience with data warehousing concepts and technologies (e.g., Hive, Snowflake, Redshift, Databricks SQL)
  • Proven experience with various data storage formats (e.g., Parquet, ORC, Avro) and data lake solutions (e.g., Delta Lake, Iceberg)
  • Experience with NoSQL databases (e.g., MongoDB, Cassandra, HBase) is a significant plus
  • Strong experience with Apache Kafka for building real-time data pipelines and event-driven architectures
  • Demonstrated experience with big data services on major cloud platforms (e.g., AWS EMR/Glue/Redshift, Azure Databricks/Data Factory/Synapse, GCP Dataflow/Dataproc/BigQuery) is highly desirable
  • Proven effectiveness with AI coding tools (e.g., Claude Code, Codex, Antigravity) is a mandatory requirement
  • Strong AI-first mindset and ability to leverage and integrate AI tools into the development workflow for continuous improvement
  • Experience with or willingness to actively explore and implement other AI-powered tools to optimize big data development processes
  • Strong ability to articulate the functional domain being worked in, understand the business context, and explain "why" the technical solutions matter
  • Advanced understanding of data structures, algorithms, and performance optimization techniques for large-scale distributed data processing
  • Experience with RESTful API design and development for data ingestion or exposure points
  • Familiarity with containerization technologies (Docker, Kubernetes) for deploying and managing big data applications
  • Expert proficiency with version control systems, especially Git, and advanced branching strategies
  • Exceptional problem-solving, analytical, and debugging skills in highly complex, distributed big data environments
  • Superior communication and interpersonal skills, with a proven ability to work effectively and autonomously within small, high-performing teams, and to mentor others
  • Demonstrated high autonomy and agency in tackling complex challenges and delivering impactful solutions
  • Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related quantitative field is required; equivalent practical experience will be considered
Responsibilities
  • Operate end-to-end in the design, development, and implementation of robust big data solutions, ensuring optimal performance, scalability, data quality, and security
  • Collaborate closely within small, co-located squads (4-7 person teams), fostering high communication and low coordination overhead, to translate complex business requirements into technical specifications for big data processing and analytical solutions
  • Act as a player/coach within the team, mentoring junior members and leading by example in the development of efficient and innovative big data architectures
  • Design, develop, and optimize large-scale data pipelines using PySpark for data ingestion, transformation, and aggregation, always with an eye towards efficiency and domain relevance
  • Implement and manage real-time data streaming and event-driven architectures using technologies like Apache Kafka
  • Design and implement sophisticated data warehousing solutions and dimensional models for efficient data storage and retrieval, ensuring alignment with business needs
  • Work with various distributed data storage technologies, including distributed file systems (e.g., HDFS, S3) and NoSQL databases (e.g., MongoDB, Cassandra), selecting the right tool for the right problem
  • Implement efficient data processing and storage strategies to optimize the performance and scalability of big data applications, with a strong focus on the "why" behind the technology choices
  • Champion best practices in software development, including rigorous code reviews, comprehensive testing, and support for continuous integration and continuous deployment (CI/CD) pipelines
  • Demonstrate high autonomy and agency in driving projects forward, making informed decisions, and proactively identifying areas for improvement
  • Proactively leverage and contribute to the development of AI-powered development tools, including internal Citi AI tools like Copilot, Claude Code, Codex, and Antigravity, to significantly enhance productivity and code quality and to accelerate development cycles
  • Lead technical discussions and contribute strategically to the evolution of our big data technology stack, always seeking innovative approaches
  • Troubleshoot and resolve complex technical issues within big data environments, demonstrating strong analytical and problem-solving skills
Desired Qualifications
  • Experience with major cloud platforms (AWS, Azure, GCP) is highly desirable
  • Experience with or willingness to explore AI-powered tools beyond the mandatory tools to optimize development processes

Citi provides financial services including consumer banking, credit, investment banking, and wealth management to individuals, corporations, and governments. The company operates by earning interest on loans and collecting fees for managing investments, processing trades, and facilitating cross-border transactions through its digital platforms. Unlike many local banks, Citi maintains a physical and digital presence in over 160 countries, allowing it to serve as a single partner for clients with global financial needs. Its goal is to drive growth and profitability for its clients and shareholders while supporting environmental and social sustainability initiatives.

Company Size

10,001+

Company Stage

IPO

Headquarters

New York City, New York

Founded

1812

Simplify's Take

What believers are saying

  • $30B buyback announced at 2026 Investor Day supports 14-15% ROTCE by 2031.
  • Hired 60 MDs from 20 rivals since 2024 to boost Banking revenues 15% in Q1 2026.
  • TTS and Securities Services growth via tech M&A accelerates embedded banking through 2026.

What critics are saying

  • JPMorgan poaches Citi's cross-border talent, eroding MD retention within 12-24 months.
  • Basel IV phases in 2025-2028 force 15-20% more capital, cutting Markets ROE.
  • Fintechs like Stripe displace Services revenues in 18-36 months via AI automation.

What makes Citi unique

  • Citi refocused in 2024-2025 by exiting 14 consumer franchises to sharpen Services and Markets.
  • Banamex consumer business targets public listing in 2025-2026 while retaining institutional Mexico unit.
  • Three engines—Services, Markets & Banking, Wealth—drive strategy under CEO Jane Fraser.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

Life Insurance

Disability Insurance

401(k) Retirement Plan

401(k) Company Match

Wellness Program

Paid Vacation

Paid Sick Leave

Paid Holidays

Company News

Yahoo Finance
Apr 14th, 2026
Banks report strong profits but warn of rising energy prices hitting consumers

America's largest banks reported strong first-quarter profits driven by robust investment banking activity and a resilient economy, though executives warned about mounting risks from rising energy prices and geopolitical uncertainty. JPMorgan Chase posted a profit of $16.49 billion, up 13% year-on-year, whilst Wells Fargo earned $5.25 billion and Citigroup reported $5.79 billion. Investment banking fees surged, with JPMorgan seeing a 30% jump and Citigroup a 12% increase in advisory fees, fuelled by market volatility and corporate dealmaking. However, JPMorgan CEO Jamie Dimon cautioned about "an increasingly complex set of risks", including wars, energy prices and trade tensions. Wells Fargo noted customers allocating more spending to petrol whilst cutting discretionary purchases, signalling potential downstream economic impacts from elevated oil prices.

The Associated Press
Apr 14th, 2026
Banks report strong Q1 profits but warn rising energy prices threaten consumer spending

America's largest banks reported strong first-quarter profits driven by investment banking activity and a resilient economy, but executives warned about emerging economic headwinds from rising energy prices and geopolitical uncertainty. JPMorgan Chase posted a 13% profit increase to $16.49 billion, with investment banking fees jumping 30%. Wells Fargo earned $5.25 billion whilst Citigroup reported $5.79 billion in profits. The gains came amid market volatility and increased merger activity. However, JPMorgan CEO Jamie Dimon cited "an increasingly complex set of risks" including wars, energy prices and trade tensions. Wells Fargo's CFO noted consumers allocating more spending towards petrol whilst reducing discretionary purchases. Dimon warned that higher oil prices' impact "will likely take some time to materialise" if they persist.

Yahoo Finance
Apr 14th, 2026
Citi stock poised to jump as Wall Street loves the name, says Jim Cramer

Citigroup has drawn investor interest, with Jim Cramer highlighting strong market sentiment towards the stock. Following earnings, Cramer noted that Citigroup is "love, love, love by everybody on Wall Street" and expects the stock to jump higher. The bank delivered solid quarterly results, with 8% revenue growth and a 35% increase in earnings per share, excluding one-time charges. Net interest income rose 14%, beating expectations. However, results were mixed across divisions, with services, banking and fixed income performing well, whilst equity trading and personal banking fell short. Trading at a significant discount to peers despite rising 66% last year, Citigroup remains attractive. CEO Jane Fraser indicated the bank's transformation efforts are over 80% complete, though questions remain about future growth once self-help measures conclude.

Yahoo Finance
Apr 14th, 2026
Citi beats Q1 profit estimates with $5.8B net income as dealmaking surges 14%

Citigroup beat first-quarter profit estimates on Tuesday, reporting net income of $5.8 billion, or $3.06 per diluted share, compared to $4.1 billion in the prior-year period. The result exceeded analysts' estimate of $2.63 per share. Revenue rose 14% whilst net income grew 42%, driven by strong dealmaking activity. Investment banking fees increased 19% to $1.3 billion, with growth in advisory and equity capital markets. Services revenue climbed 17%, and markets crossed $7 billion in revenue. Global investment banking revenue reached $28.2 billion in the first quarter, the highest since 2021. Chief executive Jane Fraser attributed the performance to softer regulation under President Trump and the AI boom. The bank remains on track to deliver its 10-11% return on tangible common equity target.

Structured Retail Products
Apr 13th, 2026
MerQube secures Series C funding from 7RIDGE and Deutsche Börse to scale derivatives-linked ETF platform

MerQube, a US-based index provider specialising in rules-based and derivatives-enabled strategies, has closed a Series C funding round led by 7RIDGE and Deutsche Börse Group. Existing investors including Allianz Life Ventures, Citi, Intel Capital, J.P. Morgan, Laurion Capital Management and UBS also participated, though the funding amount was not disclosed. The company plans to use the investment to scale its technology platform and expand in derivatives-linked ETF and structured product markets. MerQube focuses on providing customised index solutions and data-driven strategies for institutional clients.