SLB Linkedin · Posted 16d ago

Data Scientist

Houston, Texas, United States

Job Description

Build, train, and deploy large-scale, self-supervised "foundation" models that learn rich representations of time series and sequential sensor data, as well as text and vision data. These models are fine-tuned for tasks such as anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion in industrial and scientific applications.

Data/Signal Processing

  • Time Series & Sequential Data: processing, augmentation, feature engineering for financial, industrial, IoT, medical, or other sensor streams (univariate/multivariate time series).
  • Sensor Data Analysis: expertise with diverse sensor modalities (e.g., accelerometers, temperature, vibration, audio, images), sampling rates, synchronization, and real-world noise/artifact handling.
  • Multi-Modality Learning: integrating heterogeneous data types (time series, images, text, audio, structured) into robust deep learning architectures; cross-modal representation learning.

Machine Learning & Foundation Model Expertise

  • Self-supervised and Semi-supervised Learning: time series foundation models, masked modeling, contrastive methods, temporal predictive coding, multimodal alignment and fusion.
  • Model Architectures: sequence models (RNNs, GRU/LSTM, TCN), 1D/2D/3D CNNs, Transformers (BERT, ViT, TimeSformer), graph neural networks, diffusion/generative models, multi-modal/fusion encoders.
  • Transfer Learning & Fine-Tuning at Scale: prompt/adapter-based strategies, temporal domain adaptation, few-shot learning for specialized tasks.
  • Evaluation Metrics: regression/classification (MSE, F1, AUC), time series similarity (DTW, correlation), event detection/segmentation (IoU, accuracy), business/end-user KPIs.

Software & Infrastructure

  • Programming: expert Python (NumPy, SciPy, Pandas), C++/CUDA for custom kernels and high-performance preprocessing.
  • Deep Learning Frameworks: PyTorch (Lightning, Distributed), TensorFlow/Keras, JAX/Flax.
  • Large-scale Training: multi-GPU, multi-node clusters, mixed-precision, ZeRO optimization, scalable data loaders for long sequences.
  • Data Engineering: robust pipelines for ingesting, cleaning, segmenting, and aligning large-scale, time-synchronized multi-sensor datasets.

Mathematical & Algorithmic Foundations

  • Linear Algebra, Probability & Statistics, Optimization (stochastic, convex/non-convex, Bayesian).
  • Signal Processing: Fourier/wavelet analysis, filters (Kalman, Savitzky–Golay), resampling, noise modeling.
  • Numerical Methods: ODE/PDE solvers, inverse problems, regularization, time-frequency methods for complex systems.

Collaboration & Communication

  • Cross-disciplinary teamwork with domain experts, engineers, product owners, and end-users from industrial, scientific, or medical backgrounds.
  • Clear presentation of complex model behaviors (interpretability, attention analysis), uncertainty quantification, and value impact.
  • MS or Ph.D. in computer science, data science, AI, or a related field.
  • 3+ years of relevant experience in data science, AI, or a related field.