Transfer learning not improving accuracy

I have built a neural network for binary classification on my dataset (800,000 entries), which consists of malicious traffic (covering several attack types) and benign traffic. I want to use it as a pre-trained model and apply transfer learning to a small dataset that contains a new attack type (labelled malicious) plus some benign data.

Here is the code for the pre-trained model:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

preTrain_model = Sequential()
preTrain_model.add(tf.keras.layers.Dense(20, activation='relu', input_dim=11))
preTrain_model.add(tf.keras.layers.Dense(20, activation='relu'))
preTrain_model.add(tf.keras.layers.Dense(2, activation='softmax'))
preTrain_model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
                       metrics=['accuracy'])
# monitor is a callback defined elsewhere (not shown here)
preTrain_model.fit(X_training, Y_train, validation_data=(X_testing, Y_test), epochs=1, batch_size=1, verbose=2, callbacks=[monitor])
score1 = preTrain_model.evaluate(X_training, Y_train)
# Freezes only the last (output) layer
preTrain_model.layers[-1].trainable = False

for layer in preTrain_model.layers:
    print(layer, layer.trainable)

The model trains well and gives accuracy of around 93%.

However, the accuracy of the new model that uses the pre-trained model is the same as the accuracy of a model trained only on the new dataset (around 57%).

Here is the model using the pre-trained model:

myadam= tf.keras.optimizers.Adam(learning_rate=0.001)

model6= Sequential()
model6.add(Dense(20, activation='relu', input_dim=11))
model6.add(Dense(20, activation='relu'))
model6.add(Dense(2, activation='sigmoid'))
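model6.compile(loss='sparse_categorical_crossentropy', optimizer=myadam,
              metrics=['accuracy'])
# X_train_6: training features for the new dataset (name assumed; it was cut off in the post)
model6.fit(X_train_6, Y_train_6, validation_data=(X_test_6, Y_test_6), epochs=20, batch_size=5, verbose=2)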
score62= model6.evaluate(X_test_6, Y_test_6)
print('Test Accuracy: %0.2f%%' % (score62[1] * 100))

Here is a model without using the pre-trained model:

model61 = Sequential()
model61.add(Dense(20, activation='relu', input_dim=11))
model61.add(Dense(20, activation='relu'))
model61.add(Dense(2, activation='softmax'))
model61.compile(loss='sparse_categorical_crossentropy', optimizer=myadam,
              metrics=['accuracy'])
# X_train_6: training features for the new dataset (name assumed; it was cut off in the post)
model61.fit(X_train_6, Y_train_6, validation_data=(X_test_6, Y_test_6), epochs=20, batch_size=5, verbose=2)

score621= model61.evaluate(X_test_6, Y_test_6)
print('Test Accuracy: %0.2f%%' % (score621[1] * 100))

I am currently learning about transfer learning and tensorflow/keras, and any guidance or a direction would be highly appreciated. Also, let me know if more clarity is required. Thank you.

Have you tried freezing the early layers of the pre-trained model and training only the later layers? The first layers typically learn basic features that are broadly useful for analyzing the data, while the later layers identify more complex, task-specific features. The earlier layers therefore act as a feature extractor that passes its features on to the layers ahead.
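
As a rough illustration of that idea, here is a minimal sketch, assuming the same 11-feature, 20-20-2 architecture as in the question. The names build_base, pre_trained, transfer_model, n_frozen, X_new and y_new are made up for this example, the "pre-trained" network below has random weights, and the data is synthetic; in practice you would start from your trained preTrain_model and your real new-attack dataset.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential, Input
from tensorflow.keras.layers import Dense


def build_base():
    # Same layer sizes as the model in the question (hypothetical helper).
    return Sequential([
        Input(shape=(11,)),
        Dense(20, activation='relu'),
        Dense(20, activation='relu'),
        Dense(2, activation='softmax'),
    ])


# Stand-in for the already trained model; its weights are random here.
pre_trained = build_base()

# Assumption for this sketch: freeze only the first hidden layer.
n_frozen = 1

# Reuse every layer except the old output layer, freezing the early ones.
transfer_model = Sequential([Input(shape=(11,))])
for i, layer in enumerate(pre_trained.layers[:-1]):
    layer.trainable = i >= n_frozen
    transfer_model.add(layer)

# Fresh output layer that is trained on the new dataset.
transfer_model.add(Dense(2, activation='softmax'))

# Compile after setting trainable so the freeze takes effect.
transfer_model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    metrics=['accuracy'],
)

# Synthetic placeholder data standing in for the small new dataset.
rng = np.random.default_rng(0)
X_new = rng.normal(size=(64, 11)).astype('float32')
y_new = rng.integers(0, 2, size=64)

frozen_before = [w.copy() for w in pre_trained.layers[0].get_weights()]
transfer_model.fit(X_new, y_new, epochs=2, batch_size=8, verbose=0)
frozen_after = pre_trained.layers[0].get_weights()

for layer in transfer_model.layers:
    print(layer.name, layer.trainable)
```

The point of the sketch is that the new model is built from the pre-trained layers themselves, so their learned weights carry over, and that the frozen layer's weights are not updated by fit. How many layers to freeze is something to experiment with; with a network this small, freezing one or both hidden layers are both reasonable things to try.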