Research Article

An Application with Python Software for the Classification of Chemical Data

Year 2023, Issue: 1, 49 - 68, 15.08.2023
https://doi.org/10.26650/JODA.1264915

Abstract

Chemical analyses now generate and store large volumes of data. Data mining algorithms make it possible to evaluate these data, reveal the relationships among them, and use those relationships to make predictions for newly measured samples. In environmental studies, monitoring treatment processes and applying the necessary controls depend on the continuous determination of wastewater and activated sludge characteristics. The main parameters used to characterize wastewater are biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total organic carbon (TOC), and dissolved oxygen (DO). Of these, BOD5 takes 5 days to measure, whereas the others can be measured within 1-2 hours at most. Because BOD5 values can be mathematically correlated with the other parameters, estimating BOD5 from them quickly would be a considerable advantage for process control. In this study, a data set was created by measuring these parameters in 334 samples taken from a treatment plant, and the interactions among the parameters were analyzed with the decision tree method. The aim was to predict the probable BOD5 value of an unknown sample by taking the weighted effects of the parameters into account. The selected algorithm was modeled in Python, and its performance in estimating BOD5 from the other parameters was examined by extracting the decision tree rules.
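The decision tree approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nine samples, the low/medium/high BOD5 classes, the Gini split criterion, and all function names are assumptions made for illustration only. The sketch uses only the Python standard library so that it is self-contained; in practice a library such as scikit-learn would normally be used.

```python
from collections import Counter

FEATURES = ["COD", "TOC", "DO"]  # hypothetical feature order, values in mg/L

# Hypothetical illustrative samples, NOT data from the study:
# ((COD, TOC, DO), BOD5 class)
SAMPLES = [
    ((620.0, 210.0, 1.2), "high"),
    ((580.0, 190.0, 1.5), "high"),
    ((540.0, 185.0, 1.1), "high"),
    ((300.0, 100.0, 3.0), "medium"),
    ((280.0, 95.0, 3.4), "medium"),
    ((320.0, 110.0, 2.8), "medium"),
    ((120.0, 40.0, 6.1), "low"),
    ((90.0, 35.0, 6.8), "low"),
    ((150.0, 55.0, 5.5), "low"),
]


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini([y for _, y in rows])
    for f in range(len(FEATURES)):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between neighbouring observed values
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score - 1e-12:  # keep the first split on ties
                best, best_score = (f, t), score
    return best


def build_tree(rows, depth=0, max_depth=3):
    """Grow the tree recursively; a leaf is the majority class label."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return Counter(y for _, y in rows).most_common(1)[0][0]
    f, t = split
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return (f, t,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))


def predict(tree, x):
    """Follow the splits from the root down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def extract_rules(tree, conditions=()):
    """Flatten the tree into one IF ... THEN rule per leaf."""
    if not isinstance(tree, tuple):
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN BOD5 = {tree}"]
    f, t, left, right = tree
    name = FEATURES[f]
    return (extract_rules(left, conditions + (f"{name} <= {t:g}",))
            + extract_rules(right, conditions + (f"{name} > {t:g}",)))


tree = build_tree(SAMPLES)
rules = extract_rules(tree)
for rule in rules:
    print(rule)

# Classify a new, unseen sample from its fast-to-measure parameters.
print(predict(tree, (500.0, 170.0, 1.8)))
```

Each printed rule corresponds to one path from the root to a leaf, which is one common way to read "decision tree rules". With real measurements, the tree would be fitted on a training subset and evaluated on held-out samples rather than on the data used to build it.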


Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Research Articles
Authors

Gonca Ertürk 0000-0002-8821-0330

Oğuz Akpolat 0000-0002-6623-4323

Publication Date August 15, 2023
Published in Issue Year 2023 Issue: 1

Cite

APA Ertürk, G., & Akpolat, O. (2023). An Application with Python Software for the Classification of Chemical Data. Journal of Data Applications, (1), 49-68. https://doi.org/10.26650/JODA.1264915