Formal description of activation functions of machine learning models
DOI:
https://doi.org/10.20873/uft.2675-3588.2025.v6n1.p9-18
Keywords:
Artificial Intelligence, Applied Mathematics, Activation Functions, ReLU, Sigmoid, TanH, Softmax, Gradient Descent
Abstract
Artificial intelligence models are increasingly common in everyday life, not only in high-profile applications but also in routine ones such as recommendation systems on shopping websites. Developers therefore need to understand how these models work. However, heavy reliance on libraries that wrap these models can hinder that understanding. This work formally defines and demonstrates the activation functions used in machine learning models, a fundamental starting point for new developers and scientists entering the area. The formal description covers the classic ReLU, sigmoid, hyperbolic tangent, and softmax functions, as well as gradient descent. The impact of these functions on the LeNet-5 model applied to the MNIST dataset is also discussed.
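For orientation, the functions named in the abstract can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions and only the standard library; it is not taken from the paper. The function names, the quadratic example objective, and the learning rate are all assumptions chosen for illustration.

```python
import math


def relu(x):
    """ReLU: returns x for positive inputs and 0 otherwise."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, mapping the real line to (-1, 1)."""
    return math.tanh(x)


def softmax(xs):
    """Softmax over a list of scores; subtracting the max keeps exp() stable."""
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_descent_step(w, grad, lr):
    """One gradient descent update: move w against the gradient, scaled by lr."""
    return w - lr * grad


def minimize_quadratic(start=0.0, lr=0.1, steps=100):
    """Illustrative (hypothetical) objective f(w) = (w - 3)^2 with gradient 2(w - 3)."""
    w = start
    for _ in range(steps):
        w = gradient_descent_step(w, 2.0 * (w - 3.0), lr)
    return w
```

Calling `softmax([1.0, 2.0, 3.0])` returns three positive values that sum to 1, and `minimize_quadratic()` moves `w` toward the minimizer at 3. Note that gradient descent is an optimization procedure rather than an activation function; it appears here only because the abstract lists it alongside them.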
License
Copyright (c) 2024 Julia Assunção Leal

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who publish in this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution-NonCommercial License (CC BY-NC 4.0), which allows the work to be shared with acknowledgment of its authorship and initial publication in this journal;
- Authors are authorized to enter into additional contracts separately for the non-exclusive distribution of the version of the work published in this journal (e.g., publishing in an institutional repository or as a book chapter), with acknowledgment of authorship and initial publication in this journal;
- Authors are allowed and encouraged to post and distribute their work online (e.g., in institutional repositories or on their personal page) at any point after the editorial process;
- In addition, the AUTHOR is informed and agrees that the paper may be incorporated by the AJCEAM into existing or future scientific information systems and databases (indexers and databases), under the conditions defined by the latter at all times, which will involve at least the possibility that the holders of these databases may perform the following actions on the paper:
- Reproduce, transmit and distribute the paper in whole or in part in any form or means of existing or future electronic transmission, including electronic transmission for research, viewing and printing purposes;
- Reproduce and distribute all or part of the article in print;
- Translate certain parts of the paper;
- Extract figures, tables, illustrations, and other graphic objects and capture metadata, captions, and related article for research, visualization, and printing purposes;
- Allow transmission, distribution, and reproduction by agents or parties authorized by the database owners or distributors;
- Prepare bibliographic citations, summaries, and indexes, and capture related references from selected parts of the paper;
- Scan and/or store electronic article images and text.