
Is transforming Transformers the next frontier for generative AI?

By 5gantennas.org | August 18, 2024


The Transformer architecture powers today’s most popular public and private AI models, which raises the question: what comes next? Is this the architecture that will deliver better inference, or will something replace it? Building intelligence into models today requires large amounts of data, GPU compute, and rare talent, which makes models expensive to build and maintain.

The adoption of AI began with making simple chatbots more intelligent. Now, startups and large enterprises have figured out how to package intelligence in the form of a co-pilot that augments human knowledge and skills. The next natural step is to package multi-step workflows, memory, personalization, and more in the form of an agent that can solve use cases for multiple functions, including sales and engineering. The hope is that with a simple prompt from the user, the agent will be able to classify the intent and break down the goal into multiple steps to complete the task, whether that be searching the internet, authenticating to multiple tools, or learning from past repetitive behavior.

When we apply these agents to consumer use cases, we see a future where everyone has a personal agent like Jarvis on their phone that understands them. Want to book a trip to Hawaii, order food at your favorite restaurant, or manage your personal finances? A future where you can use a personalized agent to securely manage these tasks is possible, but from a technology perspective, we are still a long way from that future.

Is the Transformer architecture the final frontier?

The self-attention mechanism in the Transformer architecture lets the model weigh the importance of each input token against all other tokens in the input sequence. This improves the model’s language and computer-vision understanding by capturing long-range dependencies and complex token relationships. However, for long sequences (e.g., DNA), the computation becomes expensive, resulting in slow performance and high memory consumption. Research approaches to the long-sequence problem include:

  • Transformer improvements on hardware: A promising technique here is FlashAttention. The FlashAttention paper argues that Transformer performance can be improved by carefully managing reads and writes across the GPU’s fast and slow memory tiers: making the attention algorithm IO-aware reduces the number of transfers between the GPU’s high-bandwidth memory (HBM) and on-chip static random access memory (SRAM).
  • Approximate attention: The self-attention mechanism has O(n^2) complexity, where n is the length of the input sequence. Can this quadratic cost be reduced to linear or near-linear so that Transformers handle long sequences better? Optimizations here include models such as Reformer, Performer, and Skyformer.
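The quadratic cost above comes from materializing the full n × n score matrix. A minimal NumPy sketch of vanilla scaled dot-product self-attention makes this visible (illustrative only, with made-up random weights; not an optimized kernel like FlashAttention):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Vanilla scaled dot-product self-attention.

    X: (n, d) sequence of token embeddings. The scores matrix is
    (n, n), which is where the O(n^2) time and memory come from.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (n, n) -- quadratic in n
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
n, d = 8, 16
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (8, 16)
```

Doubling n quadruples the size of `scores`, which is exactly the scaling the optimizations above try to avoid.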

In addition to these optimizations to reduce the Transformer’s complexity, several alternative models are challenging the Transformer’s dominance (although most are still in their infancy).

  • State Space Models: These are a class of models related to recurrent (RNN) and convolutional (CNN) neural networks that compute with linear or near-linear complexity on long sequences. State-space models (SSMs) such as Mamba handle long-range dependencies well but still lag behind Transformers in overall quality.
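The core of a linear state-space layer is a fixed-size recurrent state updated once per token, which is what keeps the cost linear in sequence length. A toy sketch of a linear time-invariant SSM scan (illustrative matrices; not Mamba’s selective, input-dependent scan):

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Linear time-invariant SSM: x_t = A x_{t-1} + B u_t, y_t = C x_t.

    Runs in O(n) steps over a length-n input sequence: the state x is
    a fixed-size summary, so cost does not grow quadratically with
    sequence length the way full attention does.
    """
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:                 # one constant-cost update per token
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

# Toy 2-dimensional state over a length-5 scalar input sequence.
A = np.array([[0.9, 0.0], [0.1, 0.8]])
B = np.array([1.0, 0.0])
C = np.array([0.5, 0.5])
y = ssm_scan(A, B, C, np.ones(5))
print(y.shape)  # (5,)
```

The trade-off noted above follows from this structure: the fixed-size state compresses the entire history, so nothing is recomputed per token, but the model cannot attend back to an arbitrary earlier token the way a Transformer can.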

These research approaches have now left university labs and are available in the public domain in the form of new models for anyone to try. Additionally, the latest model releases inform the state of the underlying technology and viable paths for alternatives to Transformer.

Featured model releases

The latest and greatest models continue to arrive from regular contributors such as OpenAI, Cohere, Anthropic, and Mistral. Meta’s foundation models for compiler optimization stand out for how effectively they handle code and compiler optimization tasks.

In addition to the mainstream Transformer architecture, production-grade State Space Models (SSMs), hybrid SSM-Transformer models, Mixture of Experts (MoE), and Composition of Experts (CoE) models are now emerging, and they appear to outperform state-of-the-art open-source models on several benchmarks.

  • Databricks’ open-source DBRX model: This MoE model has 132B parameters, with 16 experts, of which 4 are active simultaneously during inference or training. It supports a 32K context window, and the model was trained on 12T tokens. Other interesting details: pre-training, post-training, evaluation, red-teaming, and model refinement took three months, $10M, and 3,072 Nvidia GPUs connected over 3.2Tbps InfiniBand.
  • SambaNova Systems’ Samba CoE v0.2 release: This CoE model consists of five 7B-parameter experts, only one of which is active during inference. All experts are open-source models, and alongside the experts the model has a router that understands which model is best suited for a given query and routes the request to it. It is very fast, generating 330 tokens/sec.
  • AI21 Labs’ Jamba release: This is a hybrid Transformer-Mamba MoE model, the first production-grade Mamba-based model with elements of the traditional Transformer architecture. “The Transformer model has two drawbacks. First, its high memory and compute requirements prevent it from handling long contexts, making the key-value (KV) cache size a limiting factor. Second, since there is no single summary state, each generated token performs computation across the entire context, slowing down inference and reducing throughput.” SSMs such as Mamba handle long-range dependencies better but lag behind Transformers in performance. Jamba compensates for the inherent limitations of a pure SSM model, offering a 256K context window and fitting 140K tokens of context on a single GPU.
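The routing pattern shared by MoE models like DBRX (and, at the whole-model level, by Samba CoE’s router) can be sketched as top-k gating: score all experts, run only the k best, and mix their outputs. A minimal sketch with hypothetical linear-map experts; this is not any vendor’s implementation:

```python
import numpy as np
from functools import partial

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x, experts, W_router, k=2):
    """Route input x to the top-k experts and mix their outputs.

    experts: list of callables (stand-ins for expert FFNs).
    Only k experts run per token, so compute per token stays roughly
    constant even as total parameter count grows with more experts.
    """
    gate = softmax(W_router @ x)           # router scores over experts
    top = np.argsort(gate)[-k:]            # indices of the k best experts
    weights = gate[top] / gate[top].sum()  # renormalize selected gates
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(1)
d, n_experts = 4, 8
# Hypothetical experts: simple linear maps standing in for expert FFNs.
mats = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [partial(np.matmul, M) for M in mats]
W_router = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, W_router, k=2)
print(y.shape)  # (4,)
```

With k=4 of 16 experts, as in DBRX, only a quarter of the expert parameters participate in any given forward pass; Samba CoE takes the same idea to the extreme with k=1 over whole models.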

Challenges for corporate adoption

While there is great excitement around the latest research and models vying to define the next frontier beyond the Transformer architecture, we must also consider the technical challenges preventing enterprises from realizing their benefits.

  • Frustrated by lack of enterprise features: Imagine selling to a CXO without simple features like Role-Based Access Control (RBAC), Single Sign-On (SSO), and access to logs (both prompt and output). Today’s models may not be enterprise-ready, but enterprises are budgeting separately to ensure they don’t miss out on the next big thing.
  • Destroying what once worked: AI copilots and agents make securing data and applications complicated. Imagine a simple use case: the video conferencing app you use every day introduces an AI summarization feature. As a user, you might appreciate the ability to get a transcript after the meeting, but in a highly regulated industry, this enhancement could suddenly become a nightmare for your CISO. In effect, something that previously worked fine breaks and must undergo additional security review. When SaaS apps introduce features like this, enterprises need guardrails to ensure data privacy and compliance.
  • The ongoing battle between RAG and fine-tuning: Depending on the use case, it is possible to deploy one, the other, or both together without sacrificing much. Retrieval-augmented generation (RAG) can be thought of as a way to ensure facts are presented correctly and information stays up to date, while fine-tuning generally yields the best model quality. Fine-tuning is difficult, which is why some model vendors discourage it; it also brings the risk of overfitting, which degrades model quality. Fine-tuning seems to be under pressure from multiple sides: as model context windows expand and token costs fall, RAG may become the better deployment option for enterprises. In the RAG context, Cohere’s recently released Command R+ is the first open-weight model to beat GPT-4 in the chatbot arena; it is a state-of-the-art RAG-optimized model designed to power enterprise-grade workflows.
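The RAG side of that trade-off reduces to retrieve-then-generate: score documents against the query, then prepend the best matches to the prompt. A toy sketch using bag-of-words cosine similarity; a real deployment would use embeddings, a vector store, and an LLM call on the returned prompt string (the document snippets below are made up for illustration):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query; keep the top k.
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Ground the model by prepending retrieved context to the question.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "DBRX is a mixture-of-experts model with 132B parameters.",
    "Jamba is a hybrid Transformer and Mamba model from AI21.",
]
print(build_prompt("How many parameters does DBRX have?", docs))
```

Because the knowledge lives in `docs` rather than in model weights, updating facts means updating the document store, with no retraining: this is the freshness advantage the bullet above describes.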

I recently spoke with an AI leader at a major financial institution who claimed that the future doesn’t belong to software engineers, but to creative English/Arts students who can create effective prompts. There may be some truth in this comment; with a quick sketch and a multimodal model, even non-technical people can create simple applications without too much effort. Knowing how to use such tools is a superpower and can be useful to anyone looking to succeed in their career.

The same holds for researchers, practitioners, and founders. There are now multiple architectures to choose from to make the underlying model cheaper, faster, and more accurate, and many ways to adapt a model to a specific use case, including fine-tuning techniques and newer breakthroughs such as Direct Preference Optimization (DPO), an algorithm that can be considered an alternative to Reinforcement Learning from Human Feedback (RLHF).
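DPO replaces RLHF’s learned reward model with a loss computed directly on preference pairs. A sketch of the per-pair DPO loss in plain Python, assuming scalar sequence log-probabilities (the example values are illustrative, not from a real model):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for a single preference pair.

    Arguments are log-probabilities of the chosen/rejected responses
    under the policy being trained (pi_*) and under the frozen
    reference model (ref_*). The loss pushes the policy to prefer
    the chosen response more strongly than the reference does,
    with beta controlling the strength of that pressure.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# Policy already leans toward the chosen response relative to the
# reference, so the loss is small but nonzero.
print(dpo_loss(-1.0, -3.0, -2.0, -2.5))
```

The appeal over RLHF is operational: no separate reward model to train and no reinforcement-learning loop, just a supervised loss over preference data.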

There’s a lot of rapid change happening in the field of generative AI, and it can feel overwhelming for founders and buyers alike to prioritize. I’m excited to see what those building something new come up with next.

Ashish Kakran is a Principal at Thomvest Ventures focused on investing in early stage Cloud, Data/ML and Cybersecurity startups.

Data Decision Maker

Welcome to the VentureBeat community!

DataDecisionMakers is a place where experts, including technologists working with data, can share data-related insights and innovations.

If you want to hear about cutting edge ideas, updates, best practices, and the future of data and data technology, join DataDecisionMakers.

You might also consider contributing your own article.

Learn more about DataDecisionMakers


